(57) Abstract: Described herein are genes whose expression is up-regulated or down-regulated
in prostate cancer. Methods and compositions that can be used for diagnosis and treatment of
prostate cancer are disclosed. Also described herein are methods of screening for modulators
of prostate cancer.

METHODS OF DIAGNOSIS OF PROSTATE CANCER,
COMPOSITIONS AND METHODS OF SCREENING FOR 
MODULATORS OF PROSTATE CANCER 

CROSS-REFERENCES TO RELATED APPLICATIONS 

This application claims priority from the following applications: USSN 
09/687,576, filed October 13, 2000; USSN 60/276,791, filed March 16, 2001; USSN
60/288,589, filed May 4, 2001; USSN 09/733,742, filed December 8, 2000; USSN
09/733,288, filed December 8, 2000; USSN 09/847,046, filed April 30, 2001; USSN
60/276,888, filed March 16, 2001; USSN 60/286,214, filed April 24, 2001; USSN
60/281,922, filed April 6, 2001; and USSN 60/263,957, filed January 24, 2001, which are
incorporated herein by reference in their entirety.

FIELD OF THE INVENTION

The invention relates to the identification of nucleic acid and protein 
expression profiles and nucleic acids, products, and antibodies thereto that are involved in 
prostate cancer, and to the use of such expression profiles and compositions in the diagnosis, 
prognosis and therapy of prostate cancer. The invention further relates to methods for
identifying and using agents and/or targets that inhibit prostate cancer.

BACKGROUND OF THE INVENTION 
Prostate cancer is the most commonly diagnosed internal malignancy and the
second most common cause of cancer death in men in the U.S., resulting in approximately
40,000 deaths each year (Landis et al., CA Cancer J. Clin. 48:6-29 (1998); Greenlee et al.,
CA Cancer J. Clin. 50(1):7-13 (2000)), and incidence of prostate cancer has been increasing
rapidly over the past 20 years in many parts of the world (Nakata et al., Int. J. Urol.
7(7):254-257 (2000); Majeed et al., BJU Int. 85(9):1058-1062 (2000)). It develops as the
result of a pathologic transformation of normal prostate cells. In tumorigenesis, the cancer
cell undergoes initiation, proliferation and loss of contact inhibition, culminating in invasion 
of surrounding tissue and, ultimately, metastasis. 

Deaths from prostate cancer are a result of metastasis of a prostate tumor.
Therefore, early detection of the development of prostate cancer is critical in reducing
mortality from this disease. Measuring levels of prostate-specific antigen (PSA) has become
a very common method for early detection and screening, and may have contributed to the
slight decrease in the mortality rate from prostate cancer in recent years (Nowroozi et al.,
Cancer Control 5(6):522-531 (1998)). However, many cases are not diagnosed until the
disease has progressed to an advanced stage.

Treatments such as surgery (prostatectomy), radiation therapy, and
cryotherapy are potentially curative when the cancer remains localized to the prostate.
Therefore, early detection of prostate cancer is important for a positive prognosis for
treatment. Systemic treatment for metastatic prostate cancer is limited to hormone therapy
and chemotherapy. Chemical or surgical castration has been the primary treatment for
symptomatic metastatic prostate cancer for over 50 years. This testicular androgen
deprivation therapy usually results in stabilization or regression of the disease (in 80% of
patients), but progression of metastatic prostate cancer eventually develops (Panvichian et al.,
Cancer Control 3(6):493-500 (1996)). Metastatic disease is currently considered incurable,
and the primary goals of treatment are to prolong survival and improve quality of life (Rago,
Cancer Control 5(6):513-521 (1998)).

Thus, methods that can be used for diagnosis and prognosis of prostate cancer
and for effective treatment of prostate cancer, particularly metastatic prostate
cancer, would be desirable. Accordingly, provided herein are methods that can be used in
diagnosis and prognosis of prostate cancer. Further provided are methods that can be used to
screen candidate bioactive agents for the ability to modulate, e.g., treat, prostate cancer.
Additionally, provided herein are molecular targets and compositions for therapeutic
intervention in prostate cancer and other cancers.

SUMMARY OF THE INVENTION
The present invention therefore provides nucleotide sequences of genes that 
are up- and down-regulated in prostate cancer cells. Such genes are useful for diagnostic
purposes, and also as targets for screening for therapeutic compounds that modulate prostate
cancer, such as hormones or antibodies. Other aspects of the invention will become apparent
to the skilled artisan by the following description of the invention. 

In one aspect, the present invention provides a method of detecting a prostate 
cancer-associated transcript in a cell from a patient, the method comprising contacting a 
biological sample from the patient with a polynucleotide that selectively hybridizes to a 
sequence at least 80% identical to a sequence as shown in Tables 1-16.

In one embodiment, the present invention provides a method of determining 
the level of a prostate cancer associated transcript in a cell from a patient. 

In one embodiment, the present invention provides a method of detecting a 
prostate cancer-associated transcript in a cell from a patient, the method comprising 
contacting a biological sample from the patient with a polynucleotide that selectively
hybridizes to a sequence at least 80% identical to a sequence as shown in Tables 1-16. 

In one embodiment, the polynucleotide selectively hybridizes to a sequence at 
least 95% identical to a sequence as shown in Tables 1-16. In another embodiment, the 
polynucleotide comprises a sequence as shown in Tables 1-16. 

In one embodiment, the biological sample is a tissue sample. In another
embodiment, the biological sample comprises isolated nucleic acids, e.g., mRNA.

In one embodiment, the polynucleotide is labeled, e.g., with a fluorescent label.

In one embodiment, the polynucleotide is immobilized on a solid surface. 

In one embodiment, the patient is undergoing a therapeutic regimen to treat
prostate cancer. In another embodiment, the patient is suspected of having metastatic
prostate cancer.

In one embodiment, the patient is a human. 

In one embodiment, the patient is suspected of having a taxol-resistant cancer. 

In one embodiment, the prostate cancer associated transcript is mRNA.

In one embodiment, the method further comprises the step of amplifying
nucleic acids before the step of contacting the biological sample with the polynucleotide.

In another aspect, the present invention provides a method of monitoring the
efficacy of a therapeutic treatment of prostate cancer, the method comprising the steps of: (i)
providing a biological sample from a patient undergoing the therapeutic treatment; and (ii)
determining the level of a prostate cancer-associated transcript in the biological sample by
contacting the biological sample with a polynucleotide that selectively hybridizes to a
sequence at least 80% identical to a sequence as shown in Tables 1-16, thereby monitoring
the efficacy of the therapy. In a further embodiment, the patient has metastatic prostate
cancer. In a further embodiment, the patient has a drug resistant (e.g., taxol resistant) form of
prostate cancer.

In one embodiment, the method further comprises the step of: (iii) comparing
the level of the prostate cancer-associated transcript to a level of the prostate
cancer-associated transcript in a biological sample from the patient prior to, or earlier in,
the therapeutic treatment.

Additionally, provided herein is a method of evaluating the effect of a
candidate prostate cancer drug comprising administering the drug to a patient and removing a
cell sample from the patient. The expression profile of the cell is then determined. This
method may further comprise comparing the expression profile to an expression profile of a
healthy individual. In a preferred embodiment, said expression profile includes a gene of
Tables 1-16.

In one aspect, the present invention provides an isolated nucleic acid molecule 
consisting of a polynucleotide sequence as shown in Tables 1-16. 

In one embodiment, an expression vector or cell comprises the isolated nucleic
acid.

In one aspect, the present invention provides an isolated polypeptide which is 
encoded by a nucleic acid molecule having a polynucleotide sequence as shown in Tables 1-16.

In another aspect, the present invention provides an antibody that specifically 
binds to an isolated polypeptide which is encoded by a nucleic acid molecule having
a polynucleotide sequence as shown in Tables 1-16.

In one embodiment, the antibody is conjugated to an effector component, e.g.,
a fluorescent label, a radioisotope or a cytotoxic chemical. 

In one embodiment, the antibody is an antibody fragment. In another 
embodiment, the antibody is humanized. 

In one aspect, the present invention provides a method of detecting a prostate
cancer cell in a biological sample from a patient, the method comprising contacting the
biological sample with an antibody as described herein. 

In another aspect, the present invention provides a method of detecting 
antibodies specific to prostate cancer in a patient, the method comprising contacting a 
biological sample from the patient with a polypeptide encoded by a nucleic acid comprising a
sequence from Tables 1-16. 

In another aspect, the present invention provides a method for identifying a 
compound that modulates a prostate cancer-associated polypeptide, the method comprising 
the steps of: (i) contacting the compound with a prostate cancer-associated polypeptide, the 
polypeptide encoded by a polynucleotide that selectively hybridizes to a sequence at least
80% identical to a sequence as shown in Tables 1-16; and (ii) determining the functional 
effect of the compound upon the polypeptide. 

In one embodiment, the functional effect is a physical effect, an enzymatic 
effect, or a chemical effect. 

In one embodiment, the polypeptide is expressed in a eukaryotic host cell or
cell membrane. In another embodiment, the polypeptide is recombinant.

In one embodiment, the functional effect is determined by measuring ligand 
binding to the polypeptide. 

In another aspect, the present invention provides a method of inhibiting 
proliferation of a prostate cancer-associated cell to treat prostate cancer in a patient, the
method comprising the step of administering to the subject a therapeutically effective amount
of a compound identified as described herein. 

In one embodiment, the compound is an antibody. 

In another aspect, the present invention provides a drug screening assay
comprising the steps of: (i) administering a test compound to a mammal having prostate
cancer or to a cell sample isolated therefrom; and (ii) comparing the level of gene expression
of a polynucleotide that selectively hybridizes to a sequence at least 80% identical to a
sequence as shown in Tables 1-16 in a treated cell or mammal with the level of gene expression
of the polynucleotide in a control cell sample or mammal, wherein a test compound that
modulates the level of expression of the polynucleotide is a candidate for the treatment of
prostate cancer.

In one embodiment, the control is a mammal with prostate cancer or a cell
sample therefrom that has not been treated with the test compound. In another embodiment,
the control is a normal cell or mammal. 

In one embodiment, the test compound is administered in varying amounts or 
concentrations. In another embodiment, the test compound is administered for varying time
periods. In another embodiment, the comparison can occur after addition or removal of the 
drug candidate. 

In one embodiment, the levels of a plurality of polynucleotides that selectively
hybridize to a sequence at least 80% identical to a sequence as shown in Tables 1-16 are
individually compared to their respective levels in a control cell sample or mammal. In a
preferred embodiment, the plurality of polynucleotides is from three to ten.
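
The individual comparison of several transcript levels against a control can be sketched as follows. This is a minimal illustration only: the transcript names, the pseudocount, and the two-fold threshold are assumptions chosen for the example and are not taken from this specification.

```python
import math


def log2_fold_changes(treated, control, pseudocount=1.0):
    """Return log2(treated / control) for each transcript measured in both samples.

    `treated` and `control` map a transcript name to an expression level.
    The pseudocount (an assumed value) avoids division by zero.
    """
    shared = treated.keys() & control.keys()
    return {
        name: math.log2((treated[name] + pseudocount) / (control[name] + pseudocount))
        for name in shared
    }


def modulated_transcripts(fold_changes, threshold=1.0):
    """Names whose level changed by at least 2**threshold-fold in either direction."""
    return sorted(name for name, lfc in fold_changes.items() if abs(lfc) >= threshold)


# Hypothetical levels for three transcripts (names are placeholders).
treated = {"geneA": 399.0, "geneB": 100.0, "geneC": 24.0}
control = {"geneA": 99.0, "geneB": 100.0, "geneC": 99.0}

changes = log2_fold_changes(treated, control)
print(modulated_transcripts(changes))
```

Each transcript is compared only with its own control level, mirroring the "individually compared" language above; how a real assay would normalize the measurements is left open.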

In another aspect, the present invention provides a method for treating a
mammal having prostate cancer comprising administering a compound identified by the
assay described herein.

In another aspect, the present invention provides a pharmaceutical
composition for treating a mammal having prostate cancer, the composition comprising a
compound identified by the assay described herein and a physiologically acceptable
excipient.

In one aspect, the present invention provides a method of screening drug 
candidates by providing a cell expressing a gene that is up- and down-regulated as in a
prostate cancer. In one embodiment, a gene is selected from Tables 1-16. The method 
further includes adding a drug candidate to the cell and determining the effect of the drug 
candidate on the expression of the expression profile gene. 

In one embodiment, the method of screening drug candidates includes 
comparing the level of expression in the absence of the drug candidate to the level of
expression in the presence of the drug candidate, wherein the concentration of the drug
candidate can vary when present, and wherein the comparison can occur after addition or
removal of the drug candidate. In a preferred embodiment, the cell expresses at least two 
expression profile genes. The profile genes may show an increase or decrease. 

Also provided is a method of evaluating the effect of a candidate prostate
cancer drug comprising administering the drug to a transgenic animal expressing or
over-expressing the prostate cancer modulatory protein, or an animal lacking the prostate
cancer modulatory protein, for example as a result of a gene knockout.

Moreover, provided herein is a biochip comprising one or more nucleic acid
segments of Tables 1-16, wherein the biochip comprises fewer than 1000 nucleic acid probes.
Preferably, at least two nucleic acid segments are included. More preferably, at least three
nucleic acid segments are included.

Furthermore, a method of diagnosing a disorder associated with prostate
cancer is provided. The method comprises determining the expression of a gene of Tables
1-16, in a first tissue type of a first individual, and comparing the distribution to the
expression of the gene from a second normal tissue type from the first individual or a second
unaffected individual. A difference in the expression indicates that the first individual has a
disorder associated with prostate cancer.

In a further embodiment, the biochip also includes a polynucleotide sequence 
of a gene that is not up- and down-regulated in prostate cancer. 

In one embodiment, a method is provided for screening for a bioactive agent capable of
interfering with the binding of a prostate cancer modulating protein (prostate cancer
modulatory protein) or a fragment thereof and an antibody which binds to said prostate
cancer modulatory protein or fragment thereof. In a preferred embodiment, the method
comprises combining a prostate cancer modulatory protein or fragment thereof, a candidate
bioactive agent and an antibody which binds to said prostate cancer modulatory protein or
fragment thereof. The method further includes determining the binding of said prostate
cancer modulatory protein or fragment thereof and said antibody. Where there is a change
in binding, an agent is identified as an interfering agent. The interfering agent can be an
agonist or an antagonist. Preferably, the agent inhibits prostate cancer.

Also provided herein are methods of eliciting an immune response in an
individual. In one embodiment, a method provided herein comprises administering to an
individual a composition comprising a prostate cancer modulating protein, or a fragment
thereof. In another embodiment, the protein is encoded by a nucleic acid selected from those 
of Tables 1-16. 

Further provided herein are compositions capable of eliciting an immune
response in an individual. In one embodiment, a composition provided herein comprises a
prostate cancer modulating protein, preferably encoded by a nucleic acid of Tables 1-16, or a
fragment thereof, and a pharmaceutically acceptable carrier. In another embodiment, said
composition comprises a nucleic acid comprising a sequence encoding a prostate cancer
modulating protein, preferably selected from the nucleic acids of Tables 1-16, and a
pharmaceutically acceptable carrier.

Also provided are methods of neutralizing the effect of a prostate cancer 
protein, or a fragment thereof, comprising contacting an agent specific for said protein with 
said protein in an amount sufficient to effect neutralization. In another embodiment, the 
protein is encoded by a nucleic acid selected from those of Tables 1-16. 

In another aspect of the invention, a method of treating an individual for
prostate cancer is provided. In one embodiment, the method comprises administering to said
individual an inhibitor of a prostate cancer modulating protein. In another embodiment, the
method comprises administering to a patient having prostate cancer an antibody to a prostate
cancer modulating protein conjugated to a therapeutic moiety. Such a therapeutic moiety can
be a cytotoxic agent or a radioisotope.

DETAILED DESCRIPTION OF THE INVENTION 
In accordance with the objects outlined above, the present invention provides 
novel methods for diagnosis and prognosis evaluation for prostate cancer (PC), including 
metastatic prostate cancer, as well as methods for screening for compositions which modulate
prostate cancer. Also provided are methods for treating prostate cancer. 

In addition to the other nucleic acid and peptide sequences, the present
invention also relates to the identification of PAA2 as a gene that is highly overexpressed in
prostate cancer patient tissues. The PAA2 sequence is identical to the zinc transporter ZNT4.
Results presented herein demonstrate that PAA2/ZNT4 is highly expressed in prostate cancer
cells. The prostate gland is unique in that it has the highest capacity of any organ in the body
to accumulate zinc. Zinc uptake is regulated by prolactin and testosterone, which induce the
expression of a member of the ZIP family of zinc transporters (Costello et al., 1999, J. Biol.
Chem. 274:17499-17504). Zinc accumulation in the prostate functions to inhibit citrate
oxidation, which results in a decrease in cellular ATP production (Costello and Franklin,
1998, Prostate 35:285-296). Cancer cells are more sensitive to decreased ATP production
and have evolved to prevent zinc accumulation. Without wishing to be bound by theory, the
up-regulation of ZNT4 in prostate cancer cells may result in protection of the cells from high
zinc levels by its ability to pump accumulated zinc out of the cells.

The present invention also relates to nucleic acid sequences encoding PBH1.
PBH1 is related to human TRPC7 (transient receptor potential-related channels, NP_003298),
a putative calcium channel highly expressed in brain (Nagamine et al., Genomics 54:124-131
(1998)). Trp is related to melastatin, a gene down-regulated in metastatic melanomas
(Duncan et al., Cancer Res. 58:1515-1520 (1998)), and MTR1, a gene localized to within the
Beckwith-Wiedemann syndrome/Wilms' tumor susceptibility region (Prawitt et al., Hum.
Mol. Genet. 9:203-216 (2000)). Without wishing to be bound by theory, it is believed that
PBH1 functions as a calcium channel.

As a calcium channel, PBH1 is an ideal target for a small molecule
therapeutic, or a therapeutic antibody that disrupts channel function. CD20, the target of
Rituximab in non-Hodgkin's lymphoma (Maloney et al., Blood 90:2188-2195 (1997); Leget
and Czuczman, Curr. Opin. Oncol. 10:548-551 (1998)), is a plasma membrane calcium
channel expressed in B cells (Tedder and Engel, Immunol. Today 15:450-454 (1994)).
Similarly, a small molecule, or antibody that inhibits or alters a calcium signal mediated by
PBH1, will result in the death of prostate cancer cells.

PBH1, and other genes of the invention, are also useful as targets for
cytotoxic T-lymphocytes. Genes that are tumor specific, or that are expressed in
immune-privileged organs, are currently being used as potential vaccine targets (Van den Eynde
and Boon, Int. J. Clin. Lab. Res. 27:81-86 (1997)). The expression pattern of PBH1 indicates
that it is an ideal target for cytotoxic T-lymphocytes. Thus, therapies that utilize
PBH1-specific cytotoxic T-lymphocytes to induce prostate cancer cell death are also provided by
this invention. See, e.g., U.S. Patent No. 6,051,227 and WO 00/32231, the disclosures of which
are herein incorporated by reference.

The present invention is also related to the identification of PAA3 as a gene
that is important in the modulation of prostate cancer and/or breast cancer.

Tables 1-16 provide unigene cluster identification numbers, exemplar 
accession numbers, or genomic nucleotide position numbers for the nucleotide sequence of 
genes that exhibit increased or decreased expression in prostate cancer samples.

Definitions 

The term "prostate cancer protein" or "prostate cancer polynucleotide" or
"prostate cancer-associated transcript" refers to nucleic acid and polypeptide polymorphic
variants, alleles, mutants, and interspecies homologues that: (1) have a nucleotide sequence
that has greater than about 60% nucleotide sequence identity, 65%, 70%, 75%, 80%, 85%,
90%, preferably 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% or greater nucleotide
sequence identity, preferably over a region of at least about 25, 50, 100, 200,
500, 1000, or more nucleotides, to a nucleotide sequence of or associated with a unigene
cluster of Tables 1-16; (2) bind to antibodies, e.g., polyclonal antibodies, raised against an
immunogen comprising an amino acid sequence encoded by a nucleotide sequence of or
associated with a unigene cluster of Tables 1-16, and conservatively modified variants
thereof; (3) specifically hybridize under stringent hybridization conditions to a nucleic acid
sequence, or the complement thereof, of Tables 1-16 and conservatively modified variants
thereof; or (4) have an amino acid sequence that has greater than about 60% amino acid
sequence identity, 65%, 70%, 75%, 80%, 85%, 90%, preferably 91%, 92%, 93%, 94%, 95%,
96%, 97%, 98% or 99% or greater amino acid sequence identity, preferably over a region of
at least about 25, 50, 100, 200, 500, 1000, or more amino acids, to an amino acid
sequence encoded by a nucleotide sequence of or associated with a unigene cluster of Tables
1-16. A polynucleotide or polypeptide sequence is typically from a mammal including, but
not limited to, primate, e.g., human; rodent, e.g., rat, mouse, hamster; cow, pig, horse, sheep,
or other mammal. A "prostate cancer polypeptide" and a "prostate cancer polynucleotide"
include both naturally occurring and recombinant forms.

A "full length" prostate cancer protein or nucleic acid refers to a prostate
cancer polypeptide or polynucleotide sequence, or a variant thereof, that contains all of the
elements normally contained in one or more naturally occurring, wild type prostate cancer
polynucleotide or polypeptide sequences. For example, a full length prostate cancer nucleic
acid will typically comprise all of the exons that encode for the full length, naturally occurring
protein. The "full length" may be prior to, or after, various stages of post-translation
processing or splicing, including alternative splicing.

"Biological sample" as used herein is a sample of biological tissue or fluid that
contains nucleic acids or polypeptides, e.g., of a prostate cancer protein, polynucleotide or
transcript. Such samples include, but are not limited to, tissue isolated from primates, e.g.,
humans, or rodents, e.g., mice, and rats. Biological samples may also include sections of
tissues such as biopsy and autopsy samples, frozen sections taken for histologic purposes,
blood, plasma, serum, sputum, stool, tears, mucus, hair, skin, etc. Biological samples also
include explants and primary and/or transformed cell cultures derived from patient tissues. A
biological sample is typically obtained from a eukaryotic organism, most preferably a
mammal such as a primate, e.g., chimpanzee or human; cow; dog; cat; a rodent, e.g., guinea
pig, rat, mouse; rabbit; or a bird; reptile; or fish.

"Providing a biological sample" means to obtain a biological sample for use in
methods described in this invention. Most often, this will be done by removing a sample of
cells from an animal, but can also be accomplished by using previously isolated cells (e.g.,
isolated by another person, at another time, and/or for another purpose), or by performing the
methods of the invention in vivo. Archival tissues, having treatment or outcome history, will
be particularly useful.

The terms "identical" or percent "identity," in the context of two or more
nucleic acids or polypeptide sequences, refer to two or more sequences or subsequences that
are the same or have a specified percentage of amino acid residues or nucleotides that are the
same (i.e., about 60% identity, preferably 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%,
95%, 96%, 97%, 98%, 99%, or higher identity over a specified region, when compared and
aligned for maximum correspondence over a comparison window or designated region) as
measured using a BLAST or BLAST 2.0 sequence comparison algorithm with default
parameters described below, or by manual alignment and visual inspection (see, e.g., NCBI
web site http://www.ncbi.nlm.nih.gov/BLAST/ or the like). Such sequences are then said to
be "substantially identical." This definition also refers to, or may be applied to, the
complement of a test sequence. The definition also includes sequences that have deletions
and/or additions, as well as those that have substitutions, as well as naturally occurring, e.g.,
polymorphic or allelic variants, and man-made variants. As described below, the preferred
algorithms can account for gaps and the like. Preferably, identity exists over a region that is
at least about 25 amino acids or nucleotides in length, or more preferably over a region that is
50-100 amino acids or nucleotides in length.

For sequence comparison, typically one sequence acts as a reference sequence,
to which test sequences are compared. When using a sequence comparison algorithm, test
and reference sequences are entered into a computer, subsequence coordinates are designated,
if necessary, and sequence algorithm program parameters are designated. Preferably, default
program parameters can be used, or alternative parameters can be designated. The sequence
comparison algorithm then calculates the percent sequence identities for the test sequences
relative to the reference sequence, based on the program parameters.
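
As a rough illustration of the percent identity calculation, the sketch below scores two sequences that have already been aligned (gaps written as "-"). It is not the BLAST procedure itself; dividing by the number of alignment columns is an assumed convention, and the function name is hypothetical.

```python
def percent_identity(aligned_ref: str, aligned_test: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Gaps are written as '-'. Identity is computed over all alignment columns
    that contain at least one residue (an assumed convention).
    """
    if len(aligned_ref) != len(aligned_test):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (a, b)
        for a, b in zip(aligned_ref.upper(), aligned_test.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("alignment contains no residues")
    # A column counts as a match only when both positions hold the same residue.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


# One mismatch in ten aligned nucleotides.
print(percent_identity("ACGTACGTAC", "ACGTTCGTAC"))
```

Under this convention a test sequence would satisfy an "at least 80% identical" criterion when the returned value is 80.0 or higher over the designated region.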

A "comparison window", as used herein, includes reference to a segment of
any one of the number of contiguous positions selected from the group consisting typically of
from 20 to 600, usually about 50 to about 200, more usually about 100 to about 150 in which
a sequence may be compared to a reference sequence of the same number of contiguous
positions after the two sequences are optimally aligned. Methods of alignment of sequences
for comparison are well-known in the art. Optimal alignment of sequences for comparison
can be conducted, e.g., by the local homology algorithm of Smith & Waterman, Adv. Appl.
Math. 2:482 (1981), by the homology alignment algorithm of Needleman & Wunsch, J. Mol.
Biol. 48:443 (1970), by the search for similarity method of Pearson & Lipman, Proc. Nat'l
Acad. Sci. USA 85:2444 (1988), by computerized implementations of these algorithms
(GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package,
Genetics Computer Group, 575 Science Dr., Madison, WI), or by manual alignment and
visual inspection (see, e.g., Current Protocols in Molecular Biology (Ausubel et al., eds.
1995 supplement)).
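
The local homology approach of Smith & Waterman can be sketched as a small dynamic program. The version below returns only the best local alignment score and uses a linear gap penalty; the scoring values (match 5, mismatch -4, gap -6) are illustrative assumptions, not parameters prescribed by this specification or by the packages named above.

```python
def smith_waterman_score(a: str, b: str, match: int = 5, mismatch: int = -4, gap: int = -6) -> int:
    """Best local alignment score between a and b (linear gap penalty)."""
    prev = [0] * (len(b) + 1)  # previous row of the scoring matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, so scores never drop below zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


# The shared core "ACGT" scores 4 matches x 5 = 20 despite differing flanks.
print(smith_waterman_score("TTACGTTT", "GGACGTGG"))
```

Keeping only two rows of the matrix is enough when the score alone is needed; recovering the aligned region itself would require storing the full matrix for traceback.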

Preferred examples of algorithms that are suitable for determining percent
sequence identity and sequence similarity include the BLAST and BLAST 2.0 algorithms,
which are described in Altschul et al., Nuc. Acids Res. 25:3389-3402 (1997) and Altschul et
al., J. Mol. Biol. 215:403-410 (1990). BLAST and BLAST 2.0 are used, with the parameters
described herein, to determine percent sequence identity for the nucleic acids and proteins of
the invention. Software for performing BLAST analyses is publicly available through the
National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/). This
algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short
words of length W in the query sequence, which either match or satisfy some positive-valued
threshold score T when aligned with a word of the same length in a database sequence. T is
referred to as the neighborhood word score threshold (Altschul et al., supra). These initial
neighborhood word hits act as seeds for initiating searches to find longer HSPs containing
them. The word hits are extended in both directions along each sequence for as far as the
cumulative alignment score can be increased. Cumulative scores are calculated using, e.g.,
for nucleotide sequences, the parameters M (reward score for a pair of matching residues;
always > 0) and N (penalty score for mismatching residues; always < 0). For amino acid
sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word
hits in each direction is halted when: the cumulative alignment score falls off by the
quantity X from its maximum achieved value; the cumulative score goes to zero or below,
due to the accumulation of one or more negative-scoring residue alignments; or the end of
either sequence is reached. The BLAST algorithm parameters W, T, and X determine the
sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences)
uses as defaults a wordlength (W) of 11, an expectation (E) of 10, M=5, N=-4 and a
comparison of both strands. For amino acid sequences, the BLASTP program uses as
defaults a wordlength of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix
(see Henikoff & Henikoff, Proc. Natl. Acad. Sci. USA 89:10915 (1992)), alignments (B) of
50, expectation (E) of 10, M=5, N=-4, and a comparison of both strands.
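
The seed-and-extend idea described above can be sketched for nucleotide sequences as follows. This is a simplified illustration, not the BLAST implementation: seeding uses exact word matches only (the neighborhood threshold T is omitted), extension is ungapped, the drop-off value x=20 is an assumed number, and the function names are hypothetical. M=5 and N=-4 follow the defaults quoted in the text.

```python
def find_seeds(query: str, subject: str, w: int = 11):
    """Return (query_pos, subject_pos) pairs where a word of length w matches exactly."""
    index = {}
    for j in range(len(subject) - w + 1):
        index.setdefault(subject[j:j + w], []).append(j)
    return [
        (i, j)
        for i in range(len(query) - w + 1)
        for j in index.get(query[i:i + w], [])
    ]


def extend_hit(query, subject, qi, si, w, m=5, n=-4, x=20):
    """Ungapped extension of a word hit; returns (score, query_start, query_end)."""

    def extend(q_pos, s_pos, step, start_score):
        score = best = start_score
        steps = best_steps = 0
        while 0 <= q_pos < len(query) and 0 <= s_pos < len(subject):
            score += m if query[q_pos] == subject[s_pos] else n
            steps += 1
            if score > best:
                best, best_steps = score, steps
            elif best - score >= x or score <= 0:
                break  # fell off by X from the maximum, or dropped to zero or below
            q_pos += step
            s_pos += step
        return best, best_steps

    score, right = extend(qi + w, si + w, +1, w * m)  # the seed itself scores w * M
    score, left = extend(qi - 1, si - 1, -1, score)
    return score, qi - left, qi + w + right


query = "GGACGTACGTACGTCC"
subject = "TTACGTACGTACGTAA"
seeds = find_seeds(query, subject, w=4)  # short word length, for the toy example only
print(extend_hit(query, subject, 2, 2, w=4))
```

In the toy example the hit at position 2 grows to the twelve shared residues `query[2:14]`, and the mismatching flanks are not included because they only lower the cumulative score.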

The BLAST algorithm also performs a statistical analysis of the similarity
between two sequences (see, e.g., Karlin & Altschul, Proc. Nat'l. Acad. Sci. USA
90:5873-5877 (1993)). One measure of similarity provided by the BLAST algorithm is the smallest
sum probability (P(N)), which provides an indication of the probability by which a match
between two nucleotide or amino acid sequences would occur by chance. For example, a
nucleic acid is considered similar to a reference sequence if the smallest sum probability in a
comparison of the test nucleic acid to the reference nucleic acid is less than about 0.2, more
preferably less than about 0.01, and most preferably less than about 0.001. Log values may
be large negative numbers, e.g., 5, 10, 20, 30, 40, 40, 70, 90, 110, 150, 170, etc.
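
As a small illustration of the thresholds above, the sketch below maps a smallest sum probability to the three tiers named in the text. The function name and tier labels are assumptions, and the "about" qualifiers in the text are treated here as hard cutoffs purely for simplicity.

```python
def similarity_tier(p_n):
    """Map a smallest sum probability P(N) to the tiers named in the text.

    Hypothetical helper; cutoffs of 0.2, 0.01, and 0.001 are taken from the
    passage above and applied as strict "less than" comparisons.
    """
    if not 0.0 <= p_n <= 1.0:
        raise ValueError("P(N) is a probability and must lie in [0, 1]")
    if p_n < 0.001:
        return "most preferred"
    if p_n < 0.01:
        return "more preferred"
    if p_n < 0.2:
        return "similar"
    return "not similar"


if __name__ == "__main__":
    for p in (0.5, 0.05, 0.005, 1e-5):
        print(p, similarity_tier(p))
```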

An indication that two nucleic acid sequences or polypeptides are substantially
identical is that the polypeptide encoded by the first nucleic acid is immunologically cross 
reactive with the antibodies raised against the polypeptide encoded by the second nucleic 
acid, as described below. Thus, a polypeptide is typically substantially identical to a second 
polypeptide, e.g., where the two peptides differ only by conservative substitutions. Another
indication that two nucleic acid sequences are substantially identical is that the two molecules 
or their complements hybridize to each other under stringent conditions, as described below. 
Yet another indication that two nucleic acid sequences are substantially identical is that the 
same primers can be used to amplify the sequences. 

A "host cell" is a naturally occurring cell or a transformed cell that contains an
expression vector and supports the replication or expression of the expression vector. Host
cells may be cultured cells, explants, cells in vivo, and the like. Host cells may be
prokaryotic cells such as E. coli, or eukaryotic cells such as yeast, insect, amphibian, or
mammalian cells such as CHO, HeLa, and the like (see, e.g., the American Type Culture
Collection catalog or web site, www.atcc.org).

The terms "isolated," "purified," or "biologically pure" refer to material that is 
substantially or essentially free from components that normally accompany it as found in its 
native state. Purity and homogeneity are typically determined using analytical chemistry 
techniques such as polyacrylamide gel electrophoresis or high performance liquid
chromatography. A protein or nucleic acid that is the predominant species present in a
preparation is substantially purified. In particular, an isolated nucleic acid is separated from
some open reading frames that naturally flank the gene and encode proteins other than the
protein encoded by the gene. The term "purified" in some embodiments denotes that a nucleic acid
or protein gives rise to essentially one band in an electrophoretic gel. Preferably, it means
that the nucleic acid or protein is at least 85% pure, more preferably at least 95% pure, and
most preferably at least 99% pure. "Purify" or "purification" in other embodiments means
removing at least one contaminant from the composition to be purified. In this sense,
purification does not require that the purified compound be homogeneous, e.g., 100% pure.

The terms "polypeptide," "peptide" and "protein" are used interchangeably
herein to refer to a polymer of amino acid residues. The terms apply to amino acid polymers
in which one or more amino acid residues is an artificial chemical mimetic of a corresponding
naturally occurring amino acid, as well as to naturally occurring amino acid polymers, those
containing modified residues, and non-naturally occurring amino acid polymers.

The term "amino acid" refers to naturally occurring and synthetic amino acids, 
as well as amino acid analogs and amino acid mimetics that function similarly to the naturally 
occurring amino acids. Naturally occurring amino acids are those encoded by the genetic
code, as well as those amino acids that are later modified, e.g., hydroxyproline,
γ-carboxyglutamate, and O-phosphoserine. Amino acid analogs refers to compounds that have
the same basic chemical structure as a naturally occurring amino acid, e.g., an α carbon that is
bound to a hydrogen, a carboxyl group, an amino group, and an R group, e.g., homoserine,
norleucine, methionine sulfoxide, methionine methyl sulfonium. Such analogs may have
modified R groups (e.g., norleucine) or modified peptide backbones, but retain the same basic 
chemical structure as a naturally occurring amino acid. Amino acid mimetics refers to 
chemical compounds that have a structure that is different from the general chemical 
structure of an amino acid, but that functions similarly to a naturally occurring amino acid. 

Amino acids may be referred to herein by either their commonly known three-letter
symbols or by the one-letter symbols recommended by the IUPAC-IUB Biochemical
Nomenclature Commission. Nucleotides, likewise, may be referred to by their commonly 
accepted single-letter codes.

"Conservatively modified variants" applies to both amino acid and nucleic
acid sequences. With respect to particular nucleic acid sequences, conservatively modified
variants refers to those nucleic acids which encode identical or essentially identical amino 
acid sequences, or where the nucleic acid does not encode an amino acid sequence, to 
essentially identical or associated, e.g., naturally contiguous, sequences. Because of the 
degeneracy of the genetic code, a large number of functionally identical nucleic acids encode
most proteins. For instance, the codons GCA, GCC, GCG and GCU all encode the amino
acid alanine. Thus, at every position where an alanine is specified by a codon, the codon can 
be altered to another of the corresponding codons described without altering the encoded 
polypeptide. Such nucleic acid variations are "silent variations," which are one species of 
conservatively modified variations. Every nucleic acid sequence herein which encodes a
polypeptide also describes silent variations of the nucleic acid. One of skill will recognize
that in certain contexts each codon in a nucleic acid (except AUG, which is ordinarily the
only codon for methionine, and TGG, which is ordinarily the only codon for tryptophan) can
be modified to yield a functionally identical molecule. Accordingly, often silent variations of
a nucleic acid which encodes a polypeptide are implicit in a described sequence with respect to
the expression product, but not with respect to actual probe sequences.
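
The notion of a silent variation can be sketched in a few lines. The codon table below is deliberately partial and covers only the codons named in the text (written in the RNA alphabet, so the tryptophan codon appears as UGG rather than TGG); the names `CODON_TABLE`, `translate`, and `is_silent_variation` are assumptions made for illustration.

```python
# Partial codon table (RNA alphabet) covering only the codons named in the text.
CODON_TABLE = {
    "GCA": "A", "GCC": "A", "GCG": "A", "GCU": "A",  # alanine: four synonymous codons
    "AUG": "M",                                      # methionine: single codon
    "UGG": "W",                                      # tryptophan: single codon
}


def translate(rna):
    """Translate an RNA string codon by codon using the partial table above."""
    if len(rna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[rna[i:i + 3]] for i in range(0, len(rna), 3))


def is_silent_variation(reference, variant):
    """True if the two sequences differ but encode the same polypeptide."""
    return (len(reference) == len(variant)
            and reference != variant
            and translate(reference) == translate(variant))


if __name__ == "__main__":
    print(translate("AUGGCAUGG"))
    print(is_silent_variation("AUGGCAUGG", "AUGGCCUGG"))
```

Swapping GCA for GCC leaves the encoded alanine unchanged, so the second sequence is a silent variation of the first; changing the alanine codon to UGG would alter the polypeptide and is not silent.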

As to amino acid sequences, one of skill will recognize that individual
substitutions, deletions or additions to a nucleic acid, peptide, polypeptide, or protein
sequence which alter, add or delete a single amino acid or a small percentage of amino
acids in the encoded sequence are "conservatively modified variants" where the alteration
results in the substitution of an amino acid with a chemically similar amino acid.
Conservative substitution tables providing functionally similar amino acids are well known in
the art. Such conservatively modified variants are in addition to and do not exclude
polymorphic variants, interspecies homologs, and alleles of the invention. The following groups
each contain amino acids that are typically conservative substitutions for one another:
1) Alanine (A), Glycine (G); 2) Aspartic acid (D), Glutamic acid (E); 3) Asparagine (N),
Glutamine (Q); 4) Arginine (R), Lysine (K); 5) Isoleucine (I), Leucine (L), Methionine (M),
Valine (V); 6) Phenylalanine (F), Tyrosine (Y), Tryptophan (W); 7) Serine (S), Threonine (T);
and 8) Cysteine (C), Methionine (M) (see, e.g., Creighton, Proteins (1984)).
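
The eight groups listed above lend themselves to a simple lookup. The sketch below encodes them as sets of one-letter symbols and tests whether a substitution stays within a group; the names `CONSERVATIVE_GROUPS` and `is_conservative` are assumptions, and methionine intentionally appears in both group 5 and group 8, as in the text.

```python
# The eight groups from the text, as sets of one-letter amino acid symbols.
CONSERVATIVE_GROUPS = [
    set("AG"),    # 1) Alanine, Glycine
    set("DE"),    # 2) Aspartic acid, Glutamic acid
    set("NQ"),    # 3) Asparagine, Glutamine
    set("RK"),    # 4) Arginine, Lysine
    set("ILMV"),  # 5) Isoleucine, Leucine, Methionine, Valine
    set("FYW"),   # 6) Phenylalanine, Tyrosine, Tryptophan
    set("ST"),    # 7) Serine, Threonine
    set("CM"),    # 8) Cysteine, Methionine
]


def is_conservative(original, replacement):
    """True if the two different residues share at least one group."""
    if original == replacement:
        return False  # not a substitution at all
    return any(original in group and replacement in group
               for group in CONSERVATIVE_GROUPS)


if __name__ == "__main__":
    print(is_conservative("I", "L"))  # same group (5)
    print(is_conservative("A", "D"))  # different groups
```

Because group membership is not transitive across groups, cysteine to valine is not treated as conservative here even though each shares a group with methionine.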

Macromolecular structures such as polypeptide structures can be described in
terms of various levels of organization. For a general discussion of this organization, see,
e.g., Alberts et al., Molecular Biology of the Cell (3rd ed., 1994) and Cantor & Schimmel,
Biophysical Chemistry Part I: The Conformation of Biological Macromolecules (1980).
"Primary structure" refers to the amino acid sequence of a particular peptide. "Secondary
structure" refers to locally ordered, three dimensional structures within a polypeptide. These
structures are commonly known as domains. Domains are portions of a polypeptide that
often form a compact unit of the polypeptide and are typically 25 to approximately 500
amino acids long. Typical domains are made up of sections of lesser organization such as
stretches of β-sheet and α-helices. "Tertiary structure" refers to the complete three
dimensional structure of a polypeptide monomer. "Quaternary structure" refers to the three
dimensional structure formed, usually by the noncovalent association of independent tertiary
units. Anisotropic terms are also known as energy terms.

"Nucleic acid" or "oligonucleotide" or "polynucleotide" or grammatical
equivalents used herein means at least two nucleotides covalently linked together.
Oligonucleotides are typically from about 5, 6, 7, 8, 9, 10, 12, 15, 25, 30, 40, 50 or more
nucleotides in length, up to about 100 nucleotides in length. Nucleic acids and
polynucleotides are polymers of any length, including longer lengths, e.g., 200, 300, 500,
1000, 2000, 3000, 5000, 7000, 10,000, etc. A nucleic acid of the present invention will
generally contain phosphodiester bonds, although in some cases, nucleic acid analogs are
included that may have alternate backbones, comprising, e.g., phosphoramidate,
phosphorothioate, phosphorodithioate, or O-methylphosphoroamidite linkages (see Eckstein,
Oligonucleotides and Analogues: A Practical Approach, Oxford University Press); and
peptide nucleic acid backbones and linkages. Other analog nucleic acids include those with
positive backbones; non-ionic backbones, and non-ribose backbones, including those
described in U.S. Patent Nos. 5,235,033 and 5,034,506, and Chapters 6 and 7, ACS
Symposium Series 580, Carbohydrate Modifications in Antisense Research, Sanghvi &
Cook, eds. Nucleic acids containing one or more carbocyclic sugars are also included within
one definition of nucleic acids. Modifications of the ribose-phosphate backbone may be done
for a variety of reasons, e.g., to increase the stability and half-life of such molecules in
physiological environments or as probes on a biochip. Mixtures of naturally occurring
nucleic acids and analogs can be made; alternatively, mixtures of different nucleic acid
analogs, and mixtures of naturally occurring nucleic acids and analogs may be made.

A variety of references disclose such nucleic acid analogs, including, for
example, phosphoramidate (Beaucage et al., Tetrahedron 49(10):1925 (1993) and references
therein; Letsinger, J. Org. Chem. 35:3800 (1970); Sprinzl et al., Eur. J. Biochem. 81:579
(1977); Letsinger et al., Nucl. Acids Res. 14:3487 (1986); Sawai et al., Chem. Lett. 805
(1984), Letsinger et al., J. Am. Chem. Soc. 110:4470 (1988); and Pauwels et al., Chemica
Scripta 26:141 (1986)), phosphorothioate (Mag et al., Nucleic Acids Res. 19:1437 (1991);
and U.S. Patent No. 5,644,048), phosphorodithioate (Briu et al., J. Am. Chem. Soc. 111:2321
(1989)), O-methylphosphoroamidite linkages (see Eckstein, Oligonucleotides and Analogues:
A Practical Approach, Oxford University Press), and peptide nucleic acid backbones and
linkages (see Egholm, J. Am. Chem. Soc. 114:1895 (1992); Meier et al., Chem. Int. Ed. Engl.
31:1008 (1992); Nielsen, Nature, 365:566 (1993); Carlsson et al., Nature 380:207 (1996), all
of which are incorporated by reference). Other analog nucleic acids include those with
positive backbones (Dempcy et al., Proc. Natl. Acad. Sci. USA 92:6097 (1995)); non-ionic
backbones (U.S. Patent Nos. 5,386,023, 5,637,684, 5,602,240, 5,216,141 and 4,469,863;
Kiedrowski et al., Angew. Chem. Intl. Ed. English 30:423 (1991); Letsinger et al., J. Am.
Chem. Soc. 110:4470 (1988); Letsinger et al., Nucleoside & Nucleotide 13:1597 (1994);
Chapters 2 and 3, ACS Symposium Series 580, "Carbohydrate Modifications in Antisense
Research", Ed. Y.S. Sanghvi and P. Dan Cook; Mesmaeker et al., Bioorganic & Medicinal
Chem. Lett. 4:395 (1994); Jeffs et al., J. Biomolecular NMR 34:17 (1994); Tetrahedron Lett.
37:743 (1996)) and non-ribose backbones, including those described in U.S. Patent Nos.
5,235,033 and 5,034,506, and Chapters 6 and 7, ACS Symposium Series 580, "Carbohydrate
Modifications in Antisense Research", Ed. Y.S. Sanghvi and P. Dan Cook. Nucleic acids
containing one or more carbocyclic sugars are also included within one definition of nucleic
acids (see Jenkins et al., Chem. Soc. Rev. (1995) pp 169-176). Several nucleic acid analogs
are described in Rawls, C & E News June 2, 1997 page 35. All of these references are hereby
expressly incorporated by reference.

Particularly preferred are peptide nucleic acids (PNA), which include peptide
nucleic acid analogs. These backbones are substantially non-ionic under neutral conditions, in
contrast to the highly charged phosphodiester backbone of naturally occurring nucleic acids.
This results in two advantages. First, the PNA backbone exhibits improved hybridization
kinetics. PNAs have larger changes in the melting temperature (Tm) for mismatched versus
perfectly matched basepairs. DNA and RNA typically exhibit a 2-4°C drop in Tm for an
internal mismatch. With the non-ionic PNA backbone, the drop is closer to 7-9°C. Similarly,
due to their non-ionic nature, hybridization of the bases attached to these backbones is
relatively insensitive to salt concentration. In addition, PNAs are not degraded by cellular
enzymes, and thus can be more stable.

The nucleic acids may be single stranded or double stranded, as specified, or
contain portions of both double stranded and single stranded sequence. As will be appreciated
by those in the art, the depiction of a single strand also defines the sequence of the
complementary strand; thus the sequences described herein also provide the complement of
the sequence. The nucleic acid may be DNA, both genomic and cDNA, RNA or a hybrid,
where the nucleic acid may contain combinations of deoxyribo- and ribo-nucleotides, and
combinations of bases, including uracil, adenine, thymine, cytosine, guanine, inosine,
xanthine, hypoxanthine, isocytosine, isoguanine, etc. "Transcript" typically refers to a
naturally occurring RNA, e.g., a pre-mRNA, hnRNA, or mRNA. As used herein, the term
"nucleoside" includes nucleotides and nucleoside and nucleotide analogs, and modified
nucleosides such as amino modified nucleosides. In addition, "nucleoside" includes
non-naturally occurring analog structures. Thus, e.g., the individual units of a peptide nucleic
acid, each containing a base, are referred to herein as a nucleoside.
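
The statement that a single depicted strand also defines its complement can be made concrete with a short sketch. It handles only the four standard DNA bases; the name `reverse_complement` and the example sequence are assumptions, and modified bases such as inosine are outside its scope.

```python
# Translation table for the four standard DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the complementary strand, written 5' to 3'."""
    # Complement each base, then reverse so the result reads 5' to 3'.
    return dna.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    print(reverse_complement("ATGC"))
```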

A "label" or a "detectable moiety" is a composition detectable by
spectroscopic, photochemical, biochemical, immunochemical, chemical, or other physical
means. For example, useful labels include fluorescent dyes, electron-dense reagents, enzymes
(e.g., as commonly used in an ELISA), biotin, digoxigenin, or haptens and proteins or other
entities which can be made detectable, e.g., by incorporating a radiolabel into the peptide or
used to detect antibodies specifically reactive with the peptide. The radioisotope may be, for
example, 3H, 14C, 32P, 35S, or 125I. In some cases, particularly using antibodies against the
proteins of the invention, the radioisotopes are used as toxic moieties, as described below.
The labels may be incorporated into the prostate cancer nucleic acids, proteins and antibodies
at any position. Any method known in the art for conjugating the antibody to the label may
be employed, including those methods described by Hunter et al., Nature, 144:945 (1962);
David et al., Biochemistry, 13:1014 (1974); Pain et al., J. Immunol. Meth., 40:219 (1981);
and Nygren, J. Histochem. and Cytochem., 30:407 (1982). The lifetime of radiolabeled
peptides or radiolabeled antibody compositions may be extended by the addition of substances
that stabilize the radiolabeled peptide or antibody and protect it from degradation. Any
substance or combination of substances that stabilize the radiolabeled peptide or antibody may
be used including those substances disclosed in US Patent No. 5,961,955.

An "effector" or "effector moiety" or "effector component" is a molecule that
is bound (or linked, or conjugated), either covalently, through a linker or a chemical bond, or
noncovalently, through ionic, van der Waals, electrostatic, or hydrogen bonds, to an antibody.
The "effector" can be a variety of molecules including, e.g., detection moieties including
radioactive compounds, fluorescent compounds, an enzyme or substrate, tags such as epitope
tags, a toxin; activatable moieties, a chemotherapeutic agent; a lipase; an antibiotic; or a
radioisotope emitting "hard" (e.g., beta) radiation.

A "labeled nucleic acid probe or oligonucleotide" is one that is bound, either
covalently, through a linker or a chemical bond, or noncovalently, through ionic, van der
Waals, electrostatic, or hydrogen bonds to a label such that the presence of the probe may be
detected by detecting the presence of the label bound to the probe. Alternatively, methods
using high affinity interactions may achieve the same results where one of a pair of binding
partners binds to the other, e.g., biotin, streptavidin.

As used herein a "nucleic acid probe or oligonucleotide" is defined as a 
nucleic acid capable of binding to a target nucleic acid of complementary sequence through 
one or more types of chemical bonds, usually through complementary base pairing, usually
through hydrogen bond formation. As used herein, a probe may include natural (i.e., A, G, C,
or T) or modified bases (7-deazaguanosine, inosine, etc.). In addition, the bases in a probe 
may be joined by a linkage other than a phosphodiester bond, so long as it does not 
functionally interfere with hybridization. Thus, e.g., probes may be peptide nucleic acids in 
which the constituent bases are joined by peptide bonds rather than phosphodiester linkages.
It will be understood by one of skill in the art that probes may bind target sequences lacking
complete complementarity with the probe sequence depending upon the stringency of the 
hybridization conditions. The probes are preferably directly labeled as with isotopes,
chromophores, lumiphores, chromogens, or indirectly labeled such as with biotin to which a
streptavidin complex may later bind. By assaying for the presence or absence of the probe,
one can detect the presence or absence of the select sequence or subsequence. Diagnosis or
prognosis may be based at the genomic level, or at the level of RNA or protein expression. 

The term "recombinant" when used with reference, e.g., to a cell, or nucleic 
acid, protein, or vector, indicates that the cell, nucleic acid, protein or vector, has been 
modified by the introduction of a heterologous nucleic acid or protein or the alteration of a
native nucleic acid or protein, or that the cell is derived from a cell so modified. Thus, e.g.,
recombinant cells express genes that are not found within the native (non-recombinant) form 
of the cell or express native genes that are otherwise abnormally expressed, under expressed 
or not expressed at all. By the term "recombinant nucleic acid" herein is meant nucleic acid, 
originally formed in vitro, in general, by the manipulation of nucleic acid, e.g., using
polymerases and endonucleases, in a form not normally found in nature. In this manner,
operable linkage of different sequences is achieved. Thus an isolated nucleic acid, in a linear
form, or an expression vector formed in vitro by ligating DNA molecules that are not
normally joined, are both considered recombinant for the purposes of this invention. It is 
understood that once a recombinant nucleic acid is made and reintroduced into a host cell or 
organism, it will replicate non-recombinantly, i.e., using the in vivo cellular machinery of the 
host cell rather than in vitro manipulations; however, such nucleic acids, once produced
recombinantly, although subsequently replicated non-recombinantly, are still considered 
recombinant for the purposes of the invention. Similarly, a "recombinant protein" is a protein 
made using recombinant techniques, i.e., through the expression of a recombinant nucleic 
acid as depicted above. 

The term "heterologous" when used with reference to portions of a nucleic
acid indicates that the nucleic acid comprises two or more subsequences that are not normally
found in the same relationship to each other in nature. For instance, the nucleic acid is 
typically recombinantly produced, having two or more sequences, e.g., from unrelated genes 
arranged to make a new functional nucleic acid, e.g., a promoter from one source and a
coding region from another source. Similarly, a heterologous protein will often refer to two
or more subsequences that are not found in the same relationship to each other in nature (e.g., 
a fusion protein). 

A "promoter" is defined as an array of nucleic acid control sequences that
direct transcription of a nucleic acid. As used herein, a promoter includes necessary nucleic
acid sequences near the start site of transcription, such as, in the case of a polymerase II type
promoter, a TATA element. A promoter also optionally includes distal enhancer or repressor 
elements, which can be located as much as several thousand base pairs from the start site of 
transcription. A "constitutive" promoter is a promoter that is active under most 
environmental and developmental conditions. An "inducible" promoter is a promoter that is
active under environmental or developmental regulation. The term "operably linked" refers
to a functional linkage between a nucleic acid expression control sequence (such as a 
promoter, or array of transcription factor binding sites) and a second nucleic acid sequence, 
wherein the expression control sequence directs transcription of the nucleic acid 
corresponding to the second sequence. 

An "expression vector" is a nucleic acid construct, generated recombinantly or
synthetically, with a series of specified nucleic acid elements that permit transcription of a
particular nucleic acid in a host cell. The expression vector can be part of a plasmid, virus, or
nucleic acid fragment. Typically, the expression vector includes a nucleic acid to be 
transcribed operably linked to a promoter. 

The phrase "selectively (or specifically) hybridizes to" refers to the binding,
duplexing, or hybridizing of a molecule only to a particular nucleotide sequence that is
determinative of the presence of the nucleotide sequence, in a heterogeneous population of
nucleic acids and other biologics (e.g., total cellular or library DNA or RNA). Similarly, the
phrase "specifically (or selectively) binds" to an antibody or "specifically (or selectively)
immunoreactive with," when referring to a protein or peptide, refers to a binding reaction that
is determinative of the presence of the protein, in a heterogeneous population of proteins and
other biologics. Thus, under designated immunoassay or nucleic acid hybridization
conditions, the specified antibodies or nucleic acid probes bind to a particular protein or
nucleotide sequence at least two times the background and more typically more than 10 to
100 times background.

Specific binding to an antibody under such conditions requires an antibody
that is selected for its specificity for a particular protein. For example, polyclonal antibodies
raised to a particular protein, polymorphic variants, alleles, orthologs, and conservatively
modified variants, or splice variants, or portions thereof, can be selected to obtain only those
polyclonal antibodies that are specifically immunoreactive with the desired prostate cancer
protein and not with other proteins. This selection may be achieved by subtracting out
antibodies that cross-react with other molecules. A variety of immunoassay formats may be
used to select antibodies specifically immunoreactive with a particular protein. For example,
solid-phase ELISA immunoassays are routinely used to select antibodies specifically
immunoreactive with a protein (see, e.g., Harlow & Lane, Antibodies, A Laboratory Manual
(1988) for a description of immunoassay formats and conditions that can be used to
determine specific immunoreactivity).

The phrase "stringent hybridization conditions" refers to conditions under
which a probe will hybridize to its target subsequence, typically in a complex mixture of
nucleic acids, but to no other sequences. Stringent conditions are sequence-dependent and
will be different in different circumstances. Longer sequences hybridize specifically at
higher temperatures. An extensive guide to the hybridization of nucleic acids is found in
Tijssen, Techniques in Biochemistry and Molecular Biology-Hybridization with Nucleic
Probes, "Overview of principles of hybridization and the strategy of nucleic acid assays"
(1993). Generally, stringent conditions are selected to be about 5-10°C lower than the
thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. The Tm is
the temperature (under defined ionic strength, pH, and nucleic acid concentration) at which 50%
of the probes complementary to the target hybridize to the target sequence at equilibrium (as
the target sequences are present in excess, at Tm, 50% of the probes are occupied at
equilibrium). Stringent conditions will be those in which the salt concentration is less than
about 1.0 M sodium ion, typically about 0.01 to 1.0 M sodium ion concentration (or other
salts) at pH 7.0 to 8.3 and the temperature is at least about 30°C for short probes (e.g., 10 to
50 nucleotides) and at least about 60°C for long probes (e.g., greater than 50 nucleotides).
Stringent conditions may also be achieved with the addition of destabilizing agents such as
formamide. For selective or specific hybridization, a positive signal is at least two times
background, preferably 10 times background hybridization. Exemplary stringent
hybridization conditions can be as follows: 50% formamide, 5x SSC, and 1% SDS,
incubating at 42°C, or, 5x SSC, 1% SDS, incubating at 65°C, with wash in 0.2x SSC, and
0.1% SDS at 65°C. For PCR, a temperature of about 36°C is typical for low stringency
amplification, although annealing temperatures may vary between about 32°C and 48°C
depending on primer length. For high stringency PCR amplification, a temperature of about
62°C is typical, although high stringency annealing temperatures can range from about 50°C
to about 65°C, depending on the primer length and specificity. Typical cycle conditions for
both high and low stringency amplifications include a denaturation phase of 90°C - 95°C for
30 sec. - 2 min., an annealing phase lasting 30 sec. - 2 min., and an extension phase of about
72°C for 1 - 2 min. Protocols and guidelines for low and high stringency amplification
reactions are provided, e.g., in Innis et al. (1990) PCR Protocols, A Guide to Methods and
Applications, Academic Press, Inc., N.Y.
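
The numerical rules of thumb in the preceding paragraph can be collected in a short sketch. The Tm itself is taken as an input because the text does not give a formula for it; all function names are assumptions, and the "about" qualifiers are treated as exact values for simplicity.

```python
def stringent_temperature_window(tm_celsius):
    """Return (low, high) hybridization temperatures, 10 to 5 degrees C below Tm."""
    return (tm_celsius - 10.0, tm_celsius - 5.0)


def minimum_stringent_temperature(probe_length):
    """Lower temperature bound quoted in the text for a given probe length.

    Assumption: probes of 50 nucleotides or fewer count as "short"
    (30 degrees C); anything longer counts as "long" (60 degrees C).
    """
    if probe_length <= 0:
        raise ValueError("probe length must be positive")
    return 30.0 if probe_length <= 50 else 60.0


def is_positive_signal(signal, background, fold=2.0):
    """True if the signal is at least `fold` times background (2x by default)."""
    return signal >= fold * background


if __name__ == "__main__":
    print(stringent_temperature_window(70.0))
    print(minimum_stringent_temperature(25), minimum_stringent_temperature(200))
    print(is_positive_signal(250, 100), is_positive_signal(900, 100, fold=10.0))
```

Passing `fold=10.0` applies the preferred 10-times-background criterion instead of the two-fold minimum.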

Nucleic acids that do not hybridize to each other under stringent conditions are
still substantially identical if the polypeptides which they encode are substantially identical.
This occurs, e.g., when a copy of a nucleic acid is created using the maximum codon
degeneracy permitted by the genetic code. In such cases, the nucleic acids typically hybridize
under moderately stringent hybridization conditions. Exemplary "moderately stringent
hybridization conditions" include a hybridization in a buffer of 40% formamide, 1 M NaCl,
1% SDS at 37°C, and a wash in 1X SSC at 45°C. A positive hybridization is at least twice
background. Those of ordinary skill will readily recognize that alternative hybridization and
wash conditions can be utilized to provide conditions of similar stringency. Additional
guidelines for determining hybridization parameters are provided in numerous references, e.g.,
Current Protocols in Molecular Biology, ed. Ausubel, et al.

The phrase "functional effects" in the context of assays for testing compounds
that modulate activity of a prostate cancer protein includes the determination of a parameter
that is indirectly or directly under the influence of the prostate cancer protein or nucleic acid,
e.g., a functional, physical, or chemical effect, such as the ability to decrease prostate cancer.
It includes ligand binding activity; cell growth on soft agar; anchorage dependence; contact
inhibition and density limitation of growth; cellular proliferation; cellular transformation;
growth factor or serum dependence; tumor specific marker levels; invasiveness into Matrigel;
tumor growth and metastasis in vivo; mRNA and protein expression in cells undergoing
metastasis, and other characteristics of prostate cancer cells. "Functional effects" include in
vitro, in vivo, and ex vivo activities.

By "determining the functional effect" is meant assaying for a compound that
increases or decreases a parameter that is indirectly or directly under the influence of a
prostate cancer protein sequence, e.g., functional, enzymatic, physical and chemical effects.
Such functional effects can be measured by any means known to those skilled in the art, e.g.,
changes in spectroscopic characteristics (e.g., fluorescence, absorbance, refractive index),
hydrodynamic (e.g., shape), chromatographic, or solubility properties for the protein,
measuring inducible markers or transcriptional activation of the prostate cancer protein;
measuring binding activity or binding assays, e.g., binding to antibodies or other ligands, and
measuring cellular proliferation. Determination of the functional effect of a compound on
prostate cancer can also be performed using prostate cancer assays known to those of skill in
the art such as in vitro assays, e.g., cell growth on soft agar; anchorage dependence;
contact inhibition and density limitation of growth; cellular proliferation; cellular
transformation; growth factor or serum dependence; tumor specific marker levels;
invasiveness into Matrigel; tumor growth and metastasis in vivo; mRNA and protein
expression in cells undergoing metastasis, and other characteristics of prostate cancer cells.
The functional effects can be evaluated by many means known to those skilled in the art, e.g.,
microscopy for quantitative or qualitative measures of alterations in morphological features,
measurement of changes in RNA or protein levels for prostate cancer-associated sequences,
measurement of RNA stability, identification of downstream or reporter gene expression
(CAT, luciferase, β-gal, GFP and the like), e.g., via chemiluminescence, fluorescence,
colorimetric reactions, antibody binding, inducible markers, and ligand binding assays.

"Inhibitors", "activators", and "modulators" of prostate cancer polynucleotide 
and polypeptide sequences are used to refer to activating, inhibitory, or modulating molecules
or compounds identified using in vitro and in vivo assays of prostate cancer polynucleotide
and polypeptide sequences. Inhibitors are compounds that, e.g., bind to, partially or totally 
block activity, decrease, prevent, delay activation, inactivate, desensitize, or down regulate 
the activity or expression of prostate cancer proteins, e.g., antagonists. Antisense nucleic
acids may also inhibit expression and subsequent function of the protein. "Activators"
are compounds that increase, open, activate, facilitate, enhance activation, sensitize, agonize,
or up regulate prostate cancer protein activity. Inhibitors, activators, or modulators also 
include genetically modified versions of prostate cancer proteins, e.g., versions with altered 
activity, as well as naturally occurring and synthetic ligands, antagonists, agonists, antibodies, 
small chemical molecules and the like. Such assays for inhibitors and activators include, e.g.,
expressing the prostate cancer protein in vitro, in cells, or cell membranes, applying putative
modulator compounds, and then determining the functional effects on activity, as described 
above. Activators and inhibitors of prostate cancer can also be identified by incubating 
prostate cancer cells with the test compound and determining increases or decreases in the 
expression of 1 or more prostate cancer proteins, e.g., 1, 2, 3, 4, 5, 10, 15, 20, 25, 30, 40, 50
or more prostate cancer proteins, such as prostate cancer proteins encoded by the sequences
set out in Tables 1-16. 

Samples or assays comprising prostate cancer proteins that are treated with a 
potential activator, inhibitor, or modulator are compared to control samples without the 
inhibitor, activator, or modulator to examine the extent of inhibition. Control samples
(untreated with inhibitors) are assigned a relative protein activity value of 100%. Inhibition
of a polypeptide is achieved when the activity value relative to the control is about 80%,
preferably 50%, more preferably 25-0%. Activation of a prostate cancer polypeptide is
achieved when the activity value relative to the control (untreated with activators) is 110%, 
more preferably 150%, more preferably 200-500% (i.e., two to five fold higher relative to the 
control), more preferably 1000-3000% higher. 
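
By way of illustration only, the comparison of a treated sample to its control described above can be sketched in Python. The function name `classify_activity` and the use of 80% and 110% as hard cutoffs are assumptions made for this sketch; the text describes these values as approximate and names more preferred ranges beyond them.

```python
def classify_activity(treated, control, inhibit_max=80.0, activate_min=110.0):
    """Express treated activity as a percentage of control and label it.

    The control sample is assigned a relative activity of 100%.
    inhibit_max and activate_min are illustrative cutoffs (assumptions),
    taken from the "about 80%" and "110%" values in the text.
    """
    if control <= 0:
        raise ValueError("control activity must be positive")
    percent_of_control = 100.0 * treated / control
    if percent_of_control <= inhibit_max:
        return "inhibition", percent_of_control
    if percent_of_control >= activate_min:
        return "activation", percent_of_control
    return "no change", percent_of_control
```

For example, a treated sample with half the activity of its control would be labeled as inhibited under these assumed cutoffs, while one at 95% of control would not be called either way.
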

The phrase "changes in cell growth" refers to any change in cell growth and
proliferation characteristics in vitro or in vivo, such as formation of foci, anchorage
independence, semi-solid or soft agar growth, changes in contact inhibition and density 
limitation of growth, loss of growth factor or serum requirements, changes in cell 
morphology, gaining or losing immortalization, gaining or losing tumor specific markers,
ability to form or suppress tumors when injected into suitable animal hosts, and/or
immortalization of the cell. See, e.g., Freshney, Culture of Animal Cells: A Manual of Basic
Technique, pp. 231-241 (3rd ed. 1994).

"Tumor cell" refers to precancerous, cancerous, and normal cells in a tumor.
"Cancer cells," "transformed" cells or "transformation" in tissue culture, refers
to spontaneous or induced phenotypic changes that do not necessarily involve the uptake of
new genetic material. Although transformation can arise from infection with a transforming 
virus and incorporation of new genomic DNA, or uptake of exogenous DNA, it can also arise 
spontaneously or following exposure to a carcinogen, thereby mutating an endogenous gene. 
Transformation is associated with phenotypic changes, such as immortalization of cells,
aberrant growth control, nonmorphological changes, and/or malignancy (see, Freshney,
Culture of Animal Cells: A Manual of Basic Technique (3rd ed. 1994)).

"Antibody" refers to a polypeptide comprising a framework region from an 
immunoglobulin gene or fragments thereof that specifically binds and recognizes an antigen. 
The recognized immunoglobulin genes include the kappa, lambda, alpha, gamma, delta,
epsilon, and mu constant region genes, as well as the myriad immunoglobulin variable region
genes. Light chains are classified as either kappa or lambda. Heavy chains are classified as
gamma, mu, alpha, delta, or epsilon, which in turn define the immunoglobulin classes, IgG, 
IgM, IgA, IgD and IgE, respectively. Typically, the antigen-binding region of an antibody or 
its functional equivalent will be most critical in specificity and affinity of binding. See Paul,
Fundamental Immunology.

An exemplary immunoglobulin (antibody) structural unit comprises a
tetramer. Each tetramer is composed of two identical pairs of polypeptide chains, each pair 
having one "light" (about 25 kD) and one "heavy" chain (about 50-70 kD). The N-terminus
of each chain defines a variable region of about 100 to 110 or more amino acids primarily
responsible for antigen recognition. The terms variable light chain (VL) and variable heavy
chain (VH) refer to these light and heavy chains respectively.

Antibodies exist, e.g., as intact immunoglobulins or as a number of well- 
characterized fragments produced by digestion with various peptidases. Thus, e.g., pepsin 
digests an antibody below the disulfide linkages in the hinge region to produce F(ab)'2, a
dimer of Fab which itself is a light chain joined to VH-CH1 by a disulfide bond. The F(ab)'2
may be reduced under mild conditions to break the disulfide linkage in the hinge region,
thereby converting the F(ab)'2 dimer into an Fab' monomer. The Fab' monomer is
essentially Fab with part of the hinge region (see Fundamental Immunology (Paul ed., 3d ed.
1993)). While various antibody fragments are defined in terms of the digestion of an intact
antibody, one of skill will appreciate that such fragments may be synthesized de novo either
chemically or by using recombinant DNA methodology. Thus, the term antibody, as used
herein, also includes antibody fragments either produced by the modification of whole
antibodies, or those synthesized de novo using recombinant DNA methodologies (e.g., single
chain Fv) or those identified using phage display libraries (see, e.g., McCafferty et al., Nature
348:552-554 (1990)).

For preparation of antibodies, e.g., recombinant, monoclonal, or polyclonal 
antibodies, many techniques known in the art can be used (see, e.g., Kohler & Milstein,
Nature 256:495-497 (1975); Kozbor et al., Immunology Today 4:72 (1983); Cole et al., pp.
77-96 in Monoclonal Antibodies and Cancer Therapy (1985); Coligan, Current Protocols in
Immunology (1991); Harlow & Lane, Antibodies, A Laboratory Manual (1988); and Goding,
Monoclonal Antibodies: Principles and Practice (2d ed. 1986)). Techniques for the 
production of single chain antibodies (U.S. Patent 4,946,778) can be adapted to produce 
antibodies to polypeptides of this invention. Also, transgenic mice, or other organisms such 
as other mammals, may be used to express humanized antibodies. Alternatively, phage
display technology can be used to identify antibodies and heteromeric Fab fragments that
specifically bind to selected antigens (see, e.g., McCafferty et al., Nature 348:552-554
(1990); Marks et al., Biotechnology 10:779-783 (1992)).

A "chimeric antibody" is an antibody molecule in which (a) the constant 
region, or a portion thereof, is altered, replaced or exchanged so that the antigen binding site 
(variable region) is linked to a constant region of a different or altered class, effector function
and/or species, or an entirely different molecule which confers new properties to the chimeric 
antibody, e.g., an enzyme, toxin, hormone, growth factor, drug, etc.; or (b) the variable 
region, or a portion thereof, is altered, replaced or exchanged with a variable region having a 
different or altered antigen specificity. 


Identification of prostate cancer-associated sequences 

In one aspect, the expression levels of genes are determined in different 
patient samples for which diagnosis information is desired, to provide expression profiles. 
An expression profile of a particular sample is essentially a "fingerprint" of the state of the
sample; while two states may have any particular gene similarly expressed, the evaluation of
a number of genes simultaneously allows the generation of a gene expression profile that is 
characteristic of the state of the cell. That is, normal tissue (e.g., normal prostate or other 
tissue) may be distinguished from cancerous or metastatic cancerous tissue of the prostate, or 
prostate cancer tissue or metastatic prostate cancerous tissue can be compared with tissue
samples of prostate and other tissues from surviving cancer patients. By comparing
expression profiles of tissue in known different prostate cancer states, information regarding
which genes are important (including both up- and down-regulation of genes) in each of these 
states is obtained. 

The identification of sequences that are differentially expressed in prostate
cancer versus non-prostate cancer tissue allows the use of this information in a number of
ways. For example, a particular treatment regime may be evaluated: does a chemotherapeutic
drug act to down-regulate prostate cancer, and thus tumor growth or recurrence, in a
particular patient? Similarly, diagnosis and treatment outcomes may be determined or confirmed by
comparing patient samples with the known expression profiles. Metastatic tissue can also be
analyzed to determine the stage of prostate cancer in the tissue. Furthermore, these gene
expression profiles (or individual genes) allow screening of drug candidates with an eye to
mimicking or altering a particular expression profile; e.g., screening can be done for drugs
that suppress the prostate cancer expression profile. This may be done by making biochips 
comprising sets of the important prostate cancer genes, which can then be used in these 
screens. These methods can also be done on the protein basis; that is, protein expression 
levels of the prostate cancer proteins can be evaluated for diagnostic purposes or to screen
candidate agents. In addition, the prostate cancer nucleic acid sequences can be administered 
for gene therapy purposes, including the administration of antisense nucleic acids, or the 
prostate cancer proteins (including antibodies and other modulators thereof) administered as 
therapeutic drugs. 

Thus the present invention provides nucleic acid and protein sequences that
are differentially expressed in prostate cancer, herein termed "prostate cancer sequences." As
outlined below, prostate cancer sequences include those that are up-regulated (i.e., expressed 
at a higher level) in prostate cancer, as well as those that are down-regulated (i.e., expressed 
at a lower level). In a preferred embodiment, the prostate cancer sequences are from humans;
however, as will be appreciated by those in the art, prostate cancer sequences from other
organisms may be useful in animal models of disease and drug evaluation; thus, other 
prostate cancer sequences are provided, from vertebrates, including mammals, including 
rodents (rats, mice, hamsters, guinea pigs, etc.), primates, farm animals (including sheep, 
goats, pigs, cows, horses, etc.) and pets (e.g., dogs, cats, etc.). Prostate cancer sequences
from other organisms may be obtained using the techniques outlined below.

Prostate cancer sequences can include both nucleic acid and amino acid 
sequences. As will be appreciated by those in the art and is more fully outlined below, 
prostate cancer nucleic acid sequences are useful in a variety of applications, including 
diagnostic applications, which will detect naturally occurring nucleic acids, as well as
screening applications; e.g., biochips comprising nucleic acid probes or PCR microtiter plates
with selected probes to the prostate cancer sequences can be generated. 

A prostate cancer sequence can be initially identified by substantial nucleic 
acid and/or amino acid sequence homology to the prostate cancer sequences outlined herein. 
Such homology can be based upon the overall nucleic acid or amino acid sequence, and is
generally determined as outlined below, using either homology programs or hybridization
conditions.

For identifying prostate cancer-associated sequences, the prostate cancer
screen typically includes comparing genes identified in different tissues, e.g., normal and 
cancerous tissues, or tumor tissue samples from patients who have metastatic disease vs.
non-metastatic tissue. Other suitable tissue comparisons include comparing prostate cancer
samples with metastatic cancer samples from other cancers, such as lung, breast,
gastrointestinal cancers, ovarian, etc. Samples of different stages of prostate cancer, e.g.,
survivor tissue, drug resistant states, and tissue undergoing metastasis, are applied to biochips 
comprising nucleic acid probes. The samples are first microdissected, if applicable, and 
treated as is known in the art for the preparation of mRNA. Suitable biochips are
commercially available, e.g., from Affymetrix. Gene expression profiles as described herein
are generated and the data analyzed. 

In one embodiment, the genes showing changes in expression as between 
normal and disease states are compared to genes expressed in other normal tissues, preferably 
normal prostate, but also including, and not limited to, lung, heart, brain, liver, breast, kidney,
muscle, colon, small intestine, large intestine, spleen, bone and placenta. In a preferred
embodiment, those genes identified during the prostate cancer screen that are expressed in 
any significant amount in other tissues are removed from the profile, although in some 
embodiments, this is not necessary. That is, when screening for drugs, it is usually preferable 
that the target be disease specific, to minimize possible side effects. 
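
The removal of candidates that are expressed in other normal tissues can be sketched as follows. This is a minimal illustration, not the screen actually used: the function name `filter_disease_specific`, the gene labels, and the numeric cutoff standing in for "any significant amount" are all assumptions.

```python
def filter_disease_specific(candidates, normal_expression, threshold):
    """Keep candidate genes whose expression stays below `threshold`
    in every other normal tissue that was measured.

    candidates        -- iterable of gene identifiers from the cancer screen
    normal_expression -- dict: gene -> {tissue name: expression level}
    threshold         -- assumed cutoff for "significant" expression
    """
    kept = []
    for gene in candidates:
        tissue_levels = normal_expression.get(gene, {})
        # A gene with no measurements in other tissues is retained.
        if all(level < threshold for level in tissue_levels.values()):
            kept.append(gene)
    return kept
```

A gene measured at a high level in, say, normal liver would be dropped from the profile, which reflects the stated preference for disease-specific targets when screening for drugs.
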

In a preferred embodiment, prostate cancer sequences are those that are
up-regulated in prostate cancer; that is, the expression of these genes is higher in the prostate
cancer tissue as compared to non-cancerous tissue. "Up-regulation" as used herein often
means at least about a two-fold change, preferably at least about a three-fold change, with at
least about five-fold or higher being preferred. All unigene cluster identification numbers
and accession numbers herein are for the GenBank sequence database and the sequences of
the accession numbers are hereby expressly incorporated by reference. GenBank is known in
the art, see, e.g., Benson, DA, et al., Nucleic Acids Research 26:1-7 (1998) and
http://www.ncbi.nlm.nih.gov/. Sequences are also available in other databases, e.g., 
European Molecular Biology Laboratory (EMBL) and DNA Database of Japan (DDBJ). 

In another preferred embodiment, prostate cancer sequences are those that are
down-regulated in prostate cancer; that is, the expression of these genes is lower in prostate
cancer tissue as compared to non-cancerous tissue (see, e.g., Tables 8, 12 and 14).
"Down-regulation" as used herein often means at least about a 1.5-fold change, more preferably a
two-fold change, preferably at least about a three-fold change, with at least about five-fold or
higher being most preferred.
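
The fold-change definitions of up-regulation and down-regulation given above can be illustrated with a short Python sketch. The function name `fold_change_call` and the treatment of the "at least about" two-fold and 1.5-fold values as exact cutoffs are assumptions made for illustration.

```python
def fold_change_call(tumor, normal, up_min=2.0, down_min=1.5):
    """Label a gene by its fold change between tumor and normal tissue.

    Returns (label, fold), where fold is tumor/normal for "up" and
    "unchanged" and normal/tumor for "down". The default cutoffs are the
    minimum fold changes named in the text, treated here as exact values
    (an assumption).
    """
    if tumor <= 0 or normal <= 0:
        raise ValueError("expression levels must be positive")
    ratio = tumor / normal
    if ratio >= up_min:
        return "up", ratio
    if normal / tumor >= down_min:
        return "down", normal / tumor
    return "unchanged", ratio
```

Raising `up_min` or `down_min` to 3.0 or 5.0 would apply the more preferred thresholds.
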


Informatics 

The ability to identify genes that are over or under expressed in prostate 
cancer can additionally provide high-resolution, high-sensitivity datasets which can be used
in the areas of diagnostics, therapeutics, drug development, pharmacogenetics, protein
structure, biosensor development, and other related areas. For example, the expression 
profiles can be used in diagnostic or prognostic evaluation of patients with prostate cancer. 
Or as another example, subcellular toxicological information can be generated to better direct 
drug structure and activity correlation (see Anderson, Pharmaceutical Proteomics: Targets,
Mechanism, and Function, paper presented at the IBC Proteomics conference, Coronado, CA
(June 11-12, 1998)). Subcellular toxicological information can also be utilized in a biological 
sensor device to predict the likely toxicological effect of chemical exposures and likely 
tolerable exposure thresholds (see U.S. Patent No. 5,811,231). Similar advantages accrue 
from datasets relevant to other biomolecules and bioactive agents (e.g., nucleic acids,
saccharides, lipids, drugs, and the like).

Thus, in another embodiment, the present invention provides a database that 
includes at least one set of assay data. The data contained in the database is acquired, e.g., 
using array analysis either singly or in a library format. The database can be in substantially
any form in which data can be maintained and transmitted, but is preferably an electronic
database. The electronic database of the invention can be maintained on any electronic
device allowing for the storage of and access to the database, such as a personal computer, 
but is preferably distributed on a wide area network, such as the World Wide Web. 

The focus of the present section on databases that include peptide sequence 
data is for clarity of illustration only. It will be apparent to those of skill in the art that similar
databases can be assembled for any assay data acquired using an assay of the invention.

The compositions and methods for identifying and/or quantitating the relative
and/or absolute abundance of a variety of molecular and macromolecular species from a 
biological sample undergoing prostate cancer, i.e., the identification of prostate cancer- 
associated sequences described herein, provide an abundance of information, which can be 
correlated with pathological conditions, predisposition to disease, drug testing, therapeutic
monitoring, gene-disease causal linkages, identification of correlates of immunity and 
physiological status, among others. Although the data generated from the assays of the 
invention is suited for manual review and analysis, in a preferred embodiment, prior data 
processing using high-speed computers is utilized. 

An array of methods for indexing and retrieving biomolecular information is
known in the art. For example, U.S. Patents 6,023,659 and 5,966,712 disclose a relational
database system for storing biomolecular sequence information in a manner that allows 
sequences to be catalogued and searched according to one or more protein function 
hierarchies. U.S. Patent 5,953,727 discloses a relational database having sequence records
containing information in a format that allows a collection of partial-length DNA sequences
to be catalogued and searched according to association with one or more sequencing projects 
for obtaining full-length sequences from the collection of partial length sequences. U.S. 
Patent 5,706,498 discloses a gene database retrieval system for making a retrieval of a gene 
sequence similar to a sequence data item in a gene database based on the degree of similarity
between a key sequence and a target sequence. U.S. Patent 5,538,897 discloses a method
using mass spectroscopy fragmentation patterns of peptides to identify amino acid sequences 
in computer databases by comparison of predicted mass spectra with experimentally-derived 
mass spectra using a closeness-of-fit measure. U.S. Patent 5,926,818 discloses a multi-
dimensional database comprising a functionality for multi-dimensional data analysis
described as on-line analytical processing (OLAP), which entails the consolidation of
projected and actual data according to more than one consolidation path or dimension. U.S.
Patent 5,295,261 reports a hybrid database structure in which the fields of each database 
record are divided into two classes, navigational and informational data, with navigational 
fields stored in a hierarchical topological map which can be viewed as a tree structure or as
the merger of two or more such tree structures.

See also Mount et al., Bioinformatics (2001); Biological Sequence Analysis:
Probabilistic Models of Proteins and Nucleic Acids (Durbin et al., eds., 1999);
Bioinformatics: A Practical Guide to the Analysis of Genes and Proteins (Baxevanis &
Ouellette, eds., 1998); Rashidi & Buehler, Bioinformatics: Basic Applications in Biological
Science and Medicine (1999); Introduction to Computational Molecular Biology (Setubal et
al., eds., 1997); Bioinformatics: Methods and Protocols (Misener & Krawetz, eds., 2000);
Bioinformatics: Sequence, Structure, and Databanks: A Practical Approach (Higgins &
Taylor, eds., 2000); Brown, Bioinformatics: A Biologist's Guide to Biocomputing and the
Internet (2001); Han & Kamber, Data Mining: Concepts and Techniques (2000); and
Waterman, Introduction to Computational Biology: Maps, Sequences, and Genomes (1995).

The present invention provides a computer database comprising a computer 
and software for storing in computer-retrievable form assay data records cross-tabulated, e.g., 
with data specifying the source of the target-containing sample from which each sequence 
specificity record was obtained. 

In an exemplary embodiment, at least one of the sources of target-containing
sample is from a control tissue sample known to be free of pathological disorders. In a
variation, at least one of the sources is a known pathological tissue specimen, e.g., a 
neoplastic lesion or another tissue specimen to be analyzed for prostate cancer. In another 
variation, the assay records cross-tabulate one or more of the following parameters for each
target species in a sample: (1) a unique identification code, which can include, e.g., a target
molecular structure and/or characteristic separation coordinate (e.g., electrophoretic 
coordinates); (2) sample source; and (3) absolute and/or relative quantity of the target species 
present in the sample. 
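
One possible in-memory representation of such cross-tabulated assay records is sketched below in Python. The class name `AssayRecord`, its field names, and the helper `cross_tabulate` are hypothetical and chosen only to mirror parameters (1) through (3) above; the specification does not prescribe any particular schema.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class AssayRecord:
    target_id: str      # (1) unique identification code for the target species
    sample_source: str  # (2) source of the target-containing sample
    quantity: float     # (3) absolute or relative quantity in the sample


def cross_tabulate(records):
    """Build a table: target_id -> {sample_source: quantity}."""
    table = defaultdict(dict)
    for record in records:
        table[record.target_id][record.sample_source] = record.quantity
    # Return plain dicts so callers do not create rows by accident.
    return {target: dict(by_source) for target, by_source in table.items()}
```

Looking up a target identifier in the resulting table gives its quantity in each sample source side by side, for example in a control specimen and in a pathological specimen.
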

The invention also provides for the storage and retrieval of a collection of
target data in a computer data storage apparatus, which can include magnetic disks, optical
disks, magneto-optical disks, DRAM, SRAM, SGRAM, SDRAM, RDRAM, DDR RAM, 
magnetic bubble memory devices, and other data storage devices, including CPU registers 
and on-CPU data storage arrays. Typically, the target data records are stored as a bit pattern 
in an array of magnetic domains on a magnetizable medium or as an array of charge states or
transistor gate states, such as an array of cells in a DRAM device (e.g., each cell comprised of
a transistor and a charge storage area, which may be on the transistor). In one embodiment,
the invention provides such storage devices, and computer systems built therewith,
comprising a bit pattern encoding a protein expression fingerprint record comprising unique 
identifiers for at least 10 target data records cross-tabulated with target source. 

When the target is a peptide or nucleic acid, the invention preferably provides
a method for identifying related peptide or nucleic acid sequences, comprising performing a
computerized comparison between a peptide or nucleic acid sequence assay record stored in 
or retrieved from a computer storage device or database and at least one other sequence. The 
comparison can include a sequence analysis or comparison algorithm or computer program 
embodiment thereof (e.g., FASTA, TFASTA, GAP, BESTFIT) and/or the comparison may
be of the relative amount of a peptide or nucleic acid sequence in a pool of sequences
determined from a polypeptide or nucleic acid sample of a specimen. 

The invention also preferably provides a magnetic disk, such as an IBM- 
compatible (DOS, Windows, Windows95/98/2000, Windows NT, OS/2) or other format 
(e.g., Linux, SunOS, Solaris, AIX, SCO Unix, VMS, MV, Macintosh, etc.) floppy diskette or
hard (fixed, Winchester) disk drive, comprising a bit pattern encoding data from an assay of
the invention in a file format suitable for retrieval and processing in a computerized sequence 
analysis, comparison, or relative quantitation method. 

The invention also provides a network, comprising a plurality of computing 
devices linked via a data link, such as an Ethernet cable (coax or 10BaseT), telephone line,
ISDN line, wireless network, optical fiber, or other suitable signal transmission medium,
whereby at least one network device (e.g., computer, disk array, etc.) comprises a pattern of 
magnetic domains (e.g., magnetic disk) and/or charge domains (e.g., an array of DRAM 
cells) composing a bit pattern encoding data acquired from an assay of the invention. 

The invention also provides a method for transmitting assay data that includes
generating an electronic signal on an electronic communications device, such as a modem,
ISDN terminal adapter, DSL, cable modem, ATM switch, or the like, wherein the signal 
includes (in native or encrypted format) a bit pattern encoding data from an assay or a 
database comprising a plurality of assay results obtained by the method of the invention. 

In a preferred embodiment, the invention provides a computer system for
comparing a query target to a database containing an array of data structures, such as an assay
result obtained by the method of the invention, and ranking database targets based on the
degree of identity and gap weight to the target data. A central processor is preferably
initialized to load and execute the computer program for alignment and/or comparison of the 
assay results. Data for a query target is entered into the central processor via an I/O device. 
Execution of the computer program results in the central processor retrieving the assay data
from the data file, which comprises a binary description of an assay result.

The target data or record and the computer program can be transferred to 
secondary memory, which is typically random access memory (e.g., DRAM, SRAM, 
SGRAM, or SDRAM). Targets are ranked according to the degree of correspondence 
between a selected assay characteristic (e.g., binding to a selected affinity moiety) and the
same characteristic of the query target and results are output via an I/O device. For example,
a central processor can be a conventional computer (e.g., Intel Pentium, PowerPC, Alpha, 
PA-8000, SPARC, MIPS 4400, MIPS 10000, VAX, etc.); a program can be a commercial or 
public domain molecular biology software package (e.g., UWGCG Sequence Analysis 
Software, Darwin); a data file can be an optical or magnetic disk, a data server, a memory
device (e.g., DRAM, SRAM, SGRAM, SDRAM, EPROM, bubble memory, flash memory,
etc.); an I/O device can be a terminal comprising a video display and a keyboard, a modem, 
an ISDN terminal adapter, an Ethernet port, a punched card reader, a magnetic strip reader, or 
other suitable I/O device. 

The invention also preferably provides the use of a computer system, such as
that described above, which comprises: (1) a computer, (2) a stored bit pattern encoding a
collection of peptide sequence specificity records obtained by the methods of the invention, 
which may be stored in the computer; (3) a comparison target, such as a query target; and (4) 
a program for alignment and comparison, typically with rank-ordering of comparison results 
on the basis of computed similarity values.
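
The rank-ordering of database targets by computed similarity to a query can be sketched as follows. This illustration substitutes the `difflib.SequenceMatcher` ratio from the Python standard library for a true alignment score; it is not FASTA, GAP, or BESTFIT, and it does not model gap weights. The function name `rank_targets` and the example sequences are assumptions.

```python
from difflib import SequenceMatcher


def rank_targets(query, database):
    """Rank stored sequences by similarity to the query, best first.

    database -- dict: record name -> sequence string
    Similarity is difflib's ratio (0.0 to 1.0), used here as a simple
    stand-in for an alignment-based identity score.
    """
    scored = [
        (SequenceMatcher(None, query, sequence).ratio(), name)
        for name, sequence in database.items()
    ]
    # Highest similarity first; ties are broken by record name.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [(name, score) for score, name in scored]
```

In a working system the similarity function would presumably be replaced by the output of a dedicated alignment program, with the ranking step left unchanged.
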


Characteristics of prostate cancer-associated proteins 

Prostate cancer proteins of the present invention may be classified as secreted 
proteins, transmembrane proteins or intracellular proteins. In one embodiment, the prostate 
cancer protein is an intracellular protein. Intracellular proteins may be found in the 
cytoplasm and/or in the nucleus. Intracellular proteins are involved in all aspects of cellular
function and replication (including, e.g., signaling pathways); aberrant expression of such
proteins often results in unregulated or dysregulated cellular processes (see, e.g., Molecular
Biology of the Cell (Alberts, ed., 3rd ed., 1994)). For example, many intracellular proteins
have enzymatic activity such as protein kinase activity, protein phosphatase activity, protease 
activity, nucleotide cyclase activity, polymerase activity and the like. Intracellular proteins 
also serve as docking proteins that are involved in organizing complexes of proteins, or
targeting proteins to various subcellular localizations, and are involved in maintaining the 
structural integrity of organelles. 

An increasingly appreciated concept in characterizing proteins is the presence 
in the proteins of one or more motifs for which defined functions have been attributed. In
addition to the highly conserved sequences found in the enzymatic domain of proteins, highly
conserved sequences have been identified in proteins that are involved in protein-protein 
interaction. For example, Src-homology-2 (SH2) domains bind tyrosine-phosphorylated 
targets in a sequence dependent manner. PTB domains, which are distinct from SH2 
domains, also bind tyrosine phosphorylated targets. SH3 domains bind to proline-rich
targets. In addition, PH domains, tetratricopeptide repeats and WD domains, to name only a
few, have been shown to mediate protein-protein interactions. Some of these may also be 
involved in binding to phospholipids or other second messengers. As will be appreciated by 
one of ordinary skill in the art, these motifs can be identified on the basis of primary 
sequence; thus, an analysis of the sequence of proteins may provide insight into both the
enzymatic potential of the molecule and/or molecules with which the protein may associate.
One useful database is Pfam (protein families), which is a large collection of multiple 
sequence alignments and hidden Markov models covering many common protein domains. 
Versions are available via the internet from Washington University in St. Louis, the Sanger
Center in England, and the Karolinska Institute in Sweden (see, e.g., Bateman et al., Nuc.
Acids Res. 28:263-266 (2000); Sonnhammer et al., Proteins 28:405-420 (1997); Bateman et
al., Nuc. Acids Res. 27:260-262 (1999); and Sonnhammer et al., Nuc. Acids Res. 26:320-322
(1998)).

In another embodiment, the prostate cancer sequences are transmembrane 
proteins. Transmembrane proteins are molecules that span a phospholipid bilayer of a cell. 
They may have an intracellular domain, an extracellular domain, or both. The intracellular
domains of such proteins may have a number of functions including those already described
for intracellular proteins. For example, the intracellular domain may have enzymatic activity
and/or may serve as a binding site for additional proteins. Frequently the intracellular
domain of transmembrane proteins serves both roles. For example, certain receptor tyrosine
kinases have both protein kinase activity and SH2 domains. In addition, autophosphorylation
of tyrosines on the receptor molecule itself creates binding sites for additional SH2 domain
containing proteins. 

Transmembrane proteins may contain from one to many transmembrane 
domains. For example, receptor tyrosine kinases, certain cytokine receptors, receptor 
guanylyl cyclases and receptor serine/threonine protein kinases contain a single
transmembrane domain. However, various other proteins including channels and adenylyl
cyclases contain numerous transmembrane domains. Many important cell surface receptors 
such as G protein coupled receptors (GPCRs) are classified as "seven transmembrane 
domain" proteins, as they contain 7 membrane spanning regions. Characteristics of 
transmembrane domains include approximately 20 consecutive hydrophobic amino acids that 

15 may be followed by charged amino acids. Therefore, upon analysis of the amino acid 
sequence of a particular protein, the localization and number of transmembrane domains 
within the protein may be predicted (see, e.g. PSORT web site http://psort.nibb.ac.jp/). 
Important transmembrane protein receptors include, but are not limited to the insulin 
receptor, insulin-like growth factor receptor, human growth hormone receptor, glucose 

20 transporters, transferrin receptor, epidermal growth factor receptor, low density lipoprotein 
receptor, epidermal growth factor receptor, leptin receptor, interleukin receptors, e.g. IL-1 
receptor, IL-2 receptor, 
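
By way of illustration only, the hydrophobicity-based prediction of transmembrane domains described above can be sketched as a sliding-window scan over the amino acid sequence. The sketch assumes the Kyte-Doolittle hydropathy scale, a 20-residue window and an average-hydropathy cutoff of 1.6; the function name predict_tm_segments and these parameter values are hypothetical choices made for this example and are not taken from PSORT or from any other program referenced herein.

```python
# Kyte-Doolittle hydropathy values (an assumed scale for this sketch).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def predict_tm_segments(seq, window=20, threshold=1.6):
    """Return 0-based half-open (start, end) spans of candidate TM segments.

    A window is a candidate when its mean hydropathy is at or above the
    threshold; overlapping candidate windows are merged into one span.
    """
    seq = seq.upper()
    spans = []
    for i in range(len(seq) - window + 1):
        mean = sum(KD.get(aa, 0.0) for aa in seq[i:i + window]) / window
        if mean >= threshold:
            if spans and i <= spans[-1][1]:
                # Extend the previous span when windows overlap or touch.
                spans[-1] = (spans[-1][0], i + window)
            else:
                spans.append((i, i + window))
    return spans


if __name__ == "__main__":
    # Hypothetical test sequence: charged flanks around a 20-residue leucine run.
    example = "DDKK" * 5 + "L" * 20 + "RRKK" * 5
    print(predict_tm_segments(example))
```

Because partially overlapping windows can also exceed the cutoff, a reported span may extend a few residues beyond the hydrophobic core; the number of spans, rather than their exact boundaries, is the more robust output of such a scan.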

The extracellular domains of transmembrane proteins are diverse; however, 
conserved motifs are found repeatedly among various extracellular domains. Conserved 
structure and/or functions have been ascribed to different extracellular motifs. Many 
extracellular domains are involved in binding to other molecules. In one aspect, extracellular 
domains are found on receptors. Factors that bind the receptor domain include circulating 
ligands, which may be peptides, proteins, or small molecules such as adenosine and the like. 
For example, growth factors such as EGF, FGF and PDGF are circulating growth factors that 
bind to their cognate receptors to initiate a variety of cellular responses. Other factors include 
cytokines, mitogenic factors, neurotrophic factors and the like. Extracellular domains also 
bind to cell-associated molecules. In this respect, they mediate cell-cell interactions. 
Cell-associated ligands can be tethered to the cell, e.g., via a glycosylphosphatidylinositol (GPI) 
anchor, or may themselves be transmembrane proteins. Extracellular domains also associate 
with the extracellular matrix and contribute to the maintenance of the cell structure. 

Prostate cancer proteins that are transmembrane are particularly preferred in 
the present invention as they are readily accessible targets for immunotherapeutics, as 
described herein. In addition, as outlined below, transmembrane proteins can also be useful 
in imaging modalities. Antibodies may be used to label such readily accessible proteins in 
situ. Alternatively, antibodies can also label intracellular proteins, in which case samples are 
typically permeabilized to provide access to intracellular proteins. 

It will also be appreciated by those in the art that a transmembrane protein can 
be made soluble by removing transmembrane sequences, e.g., through recombinant methods. 
Furthermore, transmembrane proteins that have been made soluble can be made to be 
secreted through recombinant means by adding an appropriate signal sequence. 

In another embodiment, the prostate cancer proteins are secreted proteins, the 
secretion of which can be either constitutive or regulated. These proteins have a signal 
peptide or signal sequence that targets the molecule to the secretory pathway. Secreted 
proteins are involved in numerous physiological events; by virtue of their circulating nature, 
they serve to transmit signals to various other cell types. The secreted protein may function in 
an autocrine manner (acting on the cell that secreted the factor), a paracrine manner (acting 
on cells in close proximity to the cell that secreted the factor) or an endocrine manner (acting 
on cells at a distance). Thus secreted molecules find use in modulating or altering numerous 
aspects of physiology. Prostate cancer proteins that are secreted proteins are particularly 
preferred in the present invention as they serve as good targets for diagnostic markers, e.g., 
for blood, plasma, serum, or stool tests. 

Use of prostate cancer nucleic acids 

As described above, a prostate cancer sequence is initially identified by 
substantial nucleic acid and/or amino acid sequence homology or linkage to the prostate 
cancer sequences outlined herein. Such homology can be based upon the overall nucleic acid 
or amino acid sequence, and is generally determined as outlined below, using either 

homology programs or hybridization conditions. Typically, linked sequences on an mRNA are 
found on the same molecule. 

The prostate cancer nucleic acid sequences of the invention, e.g., the 
sequences in Tables 1-16, can be fragments of larger genes, i.e., they are nucleic acid 
segments. "Genes" in this context includes coding regions, non-coding regions, and mixtures 
of coding and non-coding regions. Accordingly, as will be appreciated by those in the art, 
using the sequences provided herein, extended sequences, in either direction, of the prostate 
cancer genes can be obtained, using techniques well known in the art for cloning either longer 
sequences or the full length sequences; see Ausubel, et al., supra. Much can be done by 
informatics and many sequences can be clustered to include multiple sequences 
corresponding to a single gene, e.g., systems such as UniGene (see, 
http://www.ncbi.nlm.nih.gov/UniGene/). 

Once the prostate cancer nucleic acid is identified, it can be cloned and, if 
necessary, its constituent parts recombined to form the entire prostate cancer nucleic acid 
coding regions or the entire mRNA sequence. Once isolated from its natural source, e.g., 
contained within a plasmid or other vector or excised therefrom as a linear nucleic acid 
segment, the recombinant prostate cancer nucleic acid can be further used as a probe to 
identify and isolate other prostate cancer nucleic acids, e.g., extended coding regions. It can 
also be used as a "precursor" nucleic acid to make modified or variant prostate cancer nucleic 
acids and proteins. 

The prostate cancer nucleic acids of the present invention are used in several 
ways. In a first embodiment, nucleic acid probes to the prostate cancer nucleic acids are 
made and attached to biochips to be used in screening and diagnostic methods, as outlined 
below, or for administration, e.g., for gene therapy, vaccine, and/or antisense applications. 
Alternatively, the prostate cancer nucleic acids that include coding regions of prostate cancer 
proteins can be put into expression vectors for the expression of prostate cancer proteins, 
again for screening purposes or for administration to a patient. 

In a preferred embodiment, nucleic acid probes to prostate cancer nucleic 
acids (both the nucleic acid sequences outlined in the figures and/or the complements thereof) 
are made. The nucleic acid probes attached to the biochip are designed to be substantially 
complementary to the prostate cancer nucleic acids, i.e., the target sequence (either the target 
sequence of the sample or to other probe sequences, e.g., in sandwich assays), such that 
hybridization of the target sequence and the probes of the present invention occurs. As 
outlined below, this complementarity need not be perfect; there may be any number of base 
pair mismatches which will interfere with hybridization between the target sequence and the 
single stranded nucleic acids of the present invention. However, if the number of mutations 
is so great that no hybridization can occur under even the least stringent of hybridization 
conditions, the sequence is not a complementary target sequence. Thus, by "substantially 
complementary" herein is meant that the probes are sufficiently complementary to the target 
sequences to hybridize under normal reaction conditions, particularly high stringency 
conditions, as outlined herein. 
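
As a rough computational illustration of complementarity that "need not be perfect", a probe can be compared against the reverse complement of an equal-length target site and the mismatched positions counted. The sketch below is a sequence-level proxy only: actual hybridization depends on stringency conditions, which are not modeled, and the function names and the 10% mismatch cutoff are assumptions made for this example.

```python
# Translation table for Watson-Crick complements of DNA bases.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def count_mismatches(probe, target_site):
    """Count positions where the probe fails to pair with the target site."""
    if len(probe) != len(target_site):
        raise ValueError("probe and target site must be the same length")
    expected = reverse_complement(target_site).upper()
    return sum(1 for p, e in zip(probe.upper(), expected) if p != e)


def is_substantially_complementary(probe, target_site, max_mismatch_fraction=0.1):
    """Hypothetical cutoff: tolerate mismatches up to a fraction of probe length."""
    return count_mismatches(probe, target_site) <= max_mismatch_fraction * len(probe)


if __name__ == "__main__":
    site = "ATGGCCTTAA"                      # hypothetical target site
    print(reverse_complement(site))          # perfectly complementary probe
    print(count_mismatches("TTAAGGCCAA", site))
```

A practical probe design tool would typically replace the fixed mismatch fraction with a melting-temperature estimate for the chosen hybridization conditions.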

A nucleic acid probe is generally single stranded but can be partially single 
and partially double stranded. The strandedness of the probe is dictated by the structure, 
composition, and properties of the target sequence. In general, the nucleic acid probes range 
from about 8 to about 100 bases long, with from about 10 to about 80 bases being preferred, 
and from about 30 to about 50 bases being particularly preferred. That is, generally whole 
genes are not used. In some embodiments, much longer nucleic acids can be used, up to 
hundreds of bases. 

In a preferred embodiment, more than one probe per sequence is used, with 
either overlapping probes or probes to different sections of the target being used. That is, 
two, three, four or more probes, with three being preferred, are used to build in a redundancy 
for a particular target. The probes can be overlapping (i.e., have some sequence in common), 
or separate. In some cases, PCR primers may be used to amplify signal for higher sensitivity. 
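
The use of several probes per target can be illustrated by a simple routine that spreads a chosen number of probe sites evenly along a target sequence. This is a sketch only: the function name select_probe_sites is hypothetical, the defaults (three sites of 40 bases) are merely picked from within the preferred ranges stated above, and the routine returns windows of the target itself, so the probe actually synthesized would be the complement of each window.

```python
def select_probe_sites(target, probe_len=40, n_probes=3):
    """Return (start, target_window) pairs spread evenly over the target.

    Sites overlap when the target is short relative to the probes and are
    separate when it is long; duplicate start positions are collapsed.
    """
    if n_probes < 1:
        raise ValueError("n_probes must be at least 1")
    if probe_len > len(target):
        raise ValueError("probe_len exceeds target length")
    last_start = len(target) - probe_len
    if n_probes == 1:
        starts = [0]
    else:
        starts = sorted(
            {round(k * last_start / (n_probes - 1)) for k in range(n_probes)}
        )
    return [(s, target[s:s + probe_len]) for s in starts]


if __name__ == "__main__":
    # Hypothetical 100-base target: the three 40-base sites overlap by 10 bases.
    for start, window in select_probe_sites("ACGT" * 25):
        print(start, window)
```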

As will be appreciated by those in the art, nucleic acids can be attached or 
immobilized to a solid support in a wide variety of ways. By "immobilized" and grammatical 
equivalents herein is meant the association or binding between the nucleic acid probe and the 
solid support is sufficient to be stable under the conditions of binding, washing, analysis, and 
removal as outlined below. The binding can typically be covalent or non-covalent. By 
"non-covalent binding" and grammatical equivalents herein is meant one or more of electrostatic, 
hydrophilic, and hydrophobic interactions. Included in non-covalent binding is the covalent 
attachment of a molecule, such as streptavidin, to the support and the non-covalent binding of 
the biotinylated probe to the streptavidin. By "covalent binding" and grammatical 
equivalents herein is meant that the two moieties, the solid support and the probe, are 
attached by at least one bond, including sigma bonds, pi bonds and coordination bonds. 
Covalent bonds can be formed directly between the probe and the solid support or can be 
formed by a cross linker or by inclusion of a specific reactive group on either the solid 
support or the probe or both molecules. Immobilization may also involve a combination of 
covalent and non-covalent interactions. 

In general, the probes are attached to the biochip in a wide variety of ways, as 
will be appreciated by those in the art. As described herein, the nucleic acids can either be 
synthesized first, with subsequent attachment to the biochip, or can be directly synthesized on 
the biochip. 

The biochip comprises a suitable solid substrate. By "substrate" or "solid 
support" or other grammatical equivalents herein is meant a material that can be modified to 
contain discrete individual sites appropriate for the attachment or association of the nucleic 
acid probes and is amenable to at least one detection method. As will be appreciated by those 
in the art, the number of possible substrates is very large, and includes, but is not limited to, 
glass and modified or functionalized glass, plastics (including acrylics, polystyrene and 
copolymers of styrene and other materials, polypropylene, polyethylene, polybutylene, 
polyurethanes, Teflon™, etc.), polysaccharides, nylon or nitrocellulose, resins, silica or 
silica-based materials including silicon and modified silicon, carbon, metals, inorganic glasses, 
plastics, etc. In general, the substrates allow optical detection and do not appreciably 
fluoresce. A preferred substrate is described in copending application entitled Reusable Low 
Fluorescent Plastic Biochip, U.S. Application Serial No. 09/270,214, filed March 15, 1999, 
herein incorporated by reference in its entirety. 

Generally the substrate is planar, although as will be appreciated by those in 
the art, other configurations of substrates may be used as well. For example, the probes may 
be placed on the inside surface of a tube, for flow-through sample analysis to minimize 
sample volume. Similarly, the substrate may be flexible, such as a flexible foam, including 
closed cell foams made of particular plastics. 

In a preferred embodiment, the surface of the biochip and the probe may be 
derivatized with chemical functional groups for subsequent attachment of the two. Thus, e.g., 
the biochip is derivatized with a chemical functional group including, but not limited to, 
amino groups, carboxy groups, oxo groups and thiol groups, with amino groups being 
particularly preferred. Using these functional groups, the probes can be attached using 
functional groups on the probes. For example, nucleic acids containing amino groups can be 
attached to surfaces comprising amino groups, e.g., using linkers as are known in the art; e.g., 
homo- or hetero-bifunctional linkers as are well known (see 1994 Pierce Chemical Company 
catalog, technical section on cross-linkers, pages 155-200). In addition, in some cases, 
additional linkers, such as alkyl groups (including substituted and heteroalkyl groups) may be 
used. 

In this embodiment, oligonucleotides are synthesized as is known in the art, 
and then attached to the surface of the solid support. As will be appreciated by those skilled 
in the art, either the 5' or 3' terminus may be attached to the solid support, or attachment may 
be via an internal nucleoside. 

In another embodiment, the immobilization to the solid support may be very 
strong, yet non-covalent. For example, biotinylated oligonucleotides can be made, which 
bind to surfaces covalently coated with streptavidin, resulting in attachment. 

Alternatively, the oligonucleotides may be synthesized on the surface, as is 
known in the art. For example, photoactivation techniques utilizing photopolymerization 
compounds and techniques are used. In a preferred embodiment, the nucleic acids can be 
synthesized in situ, using well known photolithographic techniques, such as those described 
in WO 95/25116; WO 95/35505; U.S. Patent Nos. 5,700,637 and 5,445,934; and references 
cited within, all of which are expressly incorporated by reference; these methods of 
attachment form the basis of the Affymetrix GeneChip™ technology. 

Often, amplification-based assays are performed to measure the expression 
level of prostate cancer-associated sequences. These assays are typically performed in 
conjunction with reverse transcription. In such assays, a prostate cancer-associated nucleic 
acid sequence acts as a template in an amplification reaction (e.g., Polymerase Chain 
Reaction, or PCR). In a quantitative amplification, the amount of amplification product will 
be proportional to the amount of template in the original sample. Comparison to appropriate 
controls provides a measure of the amount of prostate cancer-associated RNA. Methods of 
quantitative amplification are well known to those of skill in the art. Detailed protocols for 
quantitative PCR are provided, e.g., in Innis et al., PCR Protocols, A Guide to Methods and 
Applications (1990). 
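
The proportionality between product and starting template can be illustrated with an idealized exponential model, in which each cycle multiplies the product by (1 + E), where E is the per-cycle efficiency. The sketch below is an assumption-laden simplification, not a protocol from the cited reference: it ignores plateau effects, and the threshold-cycle comparison in relative_template is a commonly used convention introduced here only for illustration, with hypothetical function names.

```python
def product_after_cycles(template_copies, cycles, efficiency=1.0):
    """Idealized product amount: template * (1 + efficiency) ** cycles."""
    return template_copies * (1.0 + efficiency) ** cycles


def relative_template(ct_sample, ct_control, efficiency=1.0):
    """Estimate sample/control template ratio from threshold cycles.

    A sample that crosses the detection threshold earlier (lower cycle
    number) than the control started with proportionally more template.
    """
    return (1.0 + efficiency) ** (ct_control - ct_sample)


if __name__ == "__main__":
    # Doubling the template doubles the product at any fixed cycle number.
    print(product_after_cycles(200, 10) / product_after_cycles(100, 10))
    # A sample crossing threshold 3 cycles before the control.
    print(relative_template(ct_sample=22, ct_control=25))
```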

In some embodiments, a TaqMan based assay is used to measure expression. 
TaqMan based assays use a fluorogenic oligonucleotide probe that contains a 5' fluorescent 
dye and a 3' quenching agent. The probe hybridizes to a PCR product, but cannot itself be 
extended due to a blocking agent at the 3' end. When the PCR product is amplified in 
subsequent cycles, the 5' nuclease activity of the polymerase, e.g., AmpliTaq, results in the 
cleavage of the TaqMan probe. This cleavage separates the 5' fluorescent dye and the 3' 
quenching agent, thereby resulting in an increase in fluorescence as a function of 
amplification (see, e.g., literature provided by Perkin-Elmer, e.g., www2.perkin-elmer.com). 

Other suitable amplification methods include, but are not limited to, ligase 
chain reaction (LCR) (see Wu & Wallace, Genomics 4:560 (1989), Landegren et al., Science 
241:1077 (1988), and Barringer et al., Gene 89:117 (1990)), transcription amplification 
(Kwoh et al., Proc. Natl. Acad. Sci. USA 86:1173 (1989)), self-sustained sequence replication 
(Guatelli et al., Proc. Natl. Acad. Sci. USA 87:1874 (1990)), dot PCR, and linker adapter PCR, 
etc. 

Expression of prostate cancer proteins from nucleic acids 

In a preferred embodiment, prostate cancer nucleic acids, e.g., those encoding 
prostate cancer proteins, are used to make a variety of expression vectors to express prostate 
cancer proteins which can then be used in screening assays, as described below. Expression 
vectors and recombinant DNA technology are well known to those of skill in the art (see, 
e.g., Ausubel, supra, and Gene Expression Systems (Fernandez & Hoeffler, eds, 1999)) and 
are used to express proteins. The expression vectors may be either self-replicating 
extrachromosomal vectors or vectors which integrate into a host genome. Generally, these 
expression vectors include transcriptional and translational regulatory nucleic acid operably 
linked to the nucleic acid encoding the prostate cancer protein. The term "control sequences" 
refers to DNA sequences used for the expression of an operably linked coding sequence in a 
particular host organism. Control sequences that are suitable for prokaryotes, e.g., include a 
promoter, optionally an operator sequence, and a ribosome binding site. Eukaryotic cells are 
known to utilize promoters, polyadenylation signals, and enhancers. 

Nucleic acid is "operably linked" when it is placed into a functional 
relationship with another nucleic acid sequence. For example, DNA for a presequence or 
secretory leader is operably linked to DNA for a polypeptide if it is expressed as a preprotein 
that participates in the secretion of the polypeptide; a promoter or enhancer is operably linked 
to a coding sequence if it affects the transcription of the sequence; or a ribosome binding site 
is operably linked to a coding sequence if it is positioned so as to facilitate translation. 
Generally, "operably linked" means that the DNA sequences being linked are contiguous, 
and, in the case of a secretory leader, contiguous and in reading phase. However, enhancers 
do not have to be contiguous. Linking is typically accomplished by ligation at convenient 
restriction sites. If such sites do not exist, synthetic oligonucleotide adaptors or linkers are 
used in accordance with conventional practice. Transcriptional and translational regulatory 
nucleic acid will generally be appropriate to the host cell used to express the prostate cancer 
protein. Numerous types of appropriate expression vectors, and suitable regulatory 
sequences are known in the art for a variety of host cells. 

In general, transcriptional and translational regulatory sequences may include, 
but are not limited to, promoter sequences, ribosomal binding sites, transcriptional start and 
stop sequences, translational start and stop sequences, and enhancer or activator sequences. 
In a preferred embodiment, the regulatory sequences include a promoter and transcriptional 
start and stop sequences. 

Promoter sequences encode either constitutive or inducible promoters. The 
promoters may be either naturally occurring promoters or hybrid promoters. Hybrid 
promoters, which combine elements of more than one promoter, are also known in the art, 
and are useful in the present invention. 

In addition, an expression vector may comprise additional elements. For 
example, the expression vector may have two replication systems, thus allowing it to be 
maintained in two organisms, e.g., in mammalian or insect cells for expression and in a 
prokaryotic host for cloning and amplification. Furthermore, for integrating expression 
vectors, the expression vector contains at least one sequence homologous to the host cell 
genome, and preferably two homologous sequences which flank the expression construct. 
The integrating vector may be directed to a specific locus in the host cell by selecting the 
appropriate homologous sequence for inclusion in the vector. Constructs for integrating 
vectors are well known in the art (e.g., Fernandez & Hoeffler, supra). 

In addition, in a preferred embodiment, the expression vector contains a 
selectable marker gene to allow the selection of transformed host cells. Selection genes are 
well known in the art and will vary with the host cell used. 

The prostate cancer proteins of the present invention are produced by culturing 
a host cell transformed with an expression vector containing nucleic acid encoding a prostate 
cancer protein, under the appropriate conditions to induce or cause expression of the prostate 
cancer protein. Conditions appropriate for prostate cancer protein expression will vary with 
the choice of the expression vector and the host cell, and will be easily ascertained by one 
skilled in the art through routine experimentation or optimization. For example, the use of 
constitutive promoters in the expression vector will require optimizing the growth and 
proliferation of the host cell, while the use of an inducible promoter requires the appropriate 
growth conditions for induction. In addition, in some embodiments, the timing of the harvest 
is important. For example, the baculoviral systems used in insect cell expression are lytic 
viruses, and thus harvest time selection can be crucial for product yield. 

Appropriate host cells include yeast, bacteria, archaebacteria, fungi, and insect 
and animal cells, including mammalian cells. Of particular interest are Saccharomyces 
cerevisiae and other yeasts, E. coli, Bacillus subtilis, Sf9 cells, C129 cells, 293 cells, 
Neurospora, BHK, CHO, COS, HeLa cells, HUVEC (human umbilical vein endothelial 
cells), THP1 cells (a macrophage cell line) and various other human cells and cell lines. 

In a preferred embodiment, the prostate cancer proteins are expressed in 
mammalian cells. Mammalian expression systems are also known in the art, and include 
retroviral and adenoviral systems. One expression vector system is a retroviral vector system 
such as is generally described in PCT/US97/01019 and PCT/US97/01048, both of which are 
hereby expressly incorporated by reference. Of particular use as mammalian promoters are 
the promoters from mammalian viral genes, since the viral genes are often highly expressed 
and have a broad host range. Examples include the SV40 early promoter, mouse mammary 
tumor virus LTR promoter, adenovirus major late promoter, herpes simplex virus promoter, 
and the CMV promoter (see, e.g., Fernandez & Hoeffler, supra). Typically, transcription 
termination and polyadenylation sequences recognized by mammalian cells are regulatory 
regions located 3' to the translation stop codon and thus, together with the promoter elements, 
flank the coding sequence. Examples of transcription terminator and polyadenylation signals 
include those derived from SV40. 

The methods of introducing exogenous nucleic acid into mammalian hosts, as 
well as other hosts, are well known in the art, and will vary with the host cell used. 
Techniques include dextran-mediated transfection, calcium phosphate precipitation, 
polybrene mediated transfection, protoplast fusion, electroporation, viral infection, 
encapsulation of the polynucleotide(s) in liposomes, and direct microinjection of the DNA 
into nuclei. 

In a preferred embodiment, prostate cancer proteins are expressed in bacterial 
systems. Bacterial expression systems are well known in the art. Promoters from 
bacteriophage may also be used and are known in the art. In addition, synthetic promoters 
and hybrid promoters are also useful; e.g., the tac promoter is a hybrid of the trp and lac 
promoter sequences. Furthermore, a bacterial promoter can include naturally occurring 
promoters of non-bacterial origin that have the ability to bind bacterial RNA polymerase and 
initiate transcription. In addition to a functioning promoter sequence, an efficient ribosome 
binding site is desirable. The expression vector may also include a signal peptide sequence 
that provides for secretion of the prostate cancer protein in bacteria. The protein is either 
secreted into the growth media (gram-positive bacteria) or into the periplasmic space, located 
between the inner and outer membrane of the cell (gram-negative bacteria). The bacterial 
expression vector may also include a selectable marker gene to allow for the selection of 
bacterial strains that have been transformed. Suitable selection genes include genes which 
render the bacteria resistant to drugs such as ampicillin, chloramphenicol, erythromycin, 
kanamycin, neomycin and tetracycline. Selectable markers also include biosynthetic genes, 
such as those in the histidine, tryptophan and leucine biosynthetic pathways. These 
components are assembled into expression vectors. Expression vectors for bacteria are well 
known in the art, and include vectors for Bacillus subtilis, E. coli, Streptococcus cremoris, 
and Streptococcus lividans, among others (e.g., Fernandez & Hoeffler, supra). The bacterial 
expression vectors are transformed into bacterial host cells using techniques well known in 
the art, such as calcium chloride treatment, electroporation, and others. 

In one embodiment, prostate cancer proteins are produced in insect cells. 
Expression vectors for the transformation of insect cells, and in particular, baculovirus-based 
expression vectors, are well known in the art. 

In a preferred embodiment, prostate cancer protein is produced in yeast cells. 
Yeast expression systems are well known in the art, and include expression vectors for 
Saccharomyces cerevisiae, Candida albicans and C. maltosa, Hansenula polymorpha, 
Kluyveromyces fragilis and K. lactis, Pichia guillerimondii and P. pastoris, 
Schizosaccharomyces pombe, and Yarrowia lipolytica. 

The prostate cancer protein may also be made as a fusion protein, using 
techniques well known in the art. Thus, e.g., for the creation of monoclonal antibodies, if the 
desired epitope is small, the prostate cancer protein may be fused to a carrier protein to form 
an immunogen. Alternatively, the prostate cancer protein may be made as a fusion protein to 
increase expression, or for other reasons. For example, when the prostate cancer protein is a 
prostate cancer peptide, the nucleic acid encoding the peptide may be linked to other nucleic 
acid for expression purposes. 

In a preferred embodiment, the prostate cancer protein is purified or isolated 
after expression. Prostate cancer proteins may be isolated or purified in a variety of ways 
known to those skilled in the art depending on what other components are present in the 
sample. Standard purification methods include electrophoretic, molecular, immunological 
and chromatographic techniques, including ion exchange, hydrophobic, affinity, and 
reverse-phase HPLC chromatography, and chromatofocusing. For example, the prostate cancer 
protein may be purified using a standard anti-prostate cancer protein antibody column. 
Ultrafiltration and diafiltration techniques, in conjunction with protein concentration, are also 
useful. For general guidance in suitable purification techniques, see Scopes, Protein 
Purification (1982). The degree of purification necessary will vary depending on the use of 
the prostate cancer protein. In some instances no purification will be necessary. 

Once expressed and purified if necessary, the prostate cancer proteins and 
nucleic acids are useful in a number of applications. They may be used as immunoselection 
reagents, as vaccine reagents, as screening agents, etc. 

Variants of prostate cancer proteins 

In one embodiment, the prostate cancer proteins are derivative or variant 
prostate cancer proteins as compared to the wild-type sequence. That is, as outlined more 
fully below, the derivative prostate cancer peptide will often contain at least one amino acid 
substitution, deletion or insertion, with amino acid substitutions being particularly preferred. 
The amino acid substitution, insertion or deletion may occur at any residue within the 
prostate cancer peptide. 

Also included within one embodiment of prostate cancer proteins of the 
present invention are amino acid sequence variants. These variants typically fall into one or 
more of three classes: substitutional, insertional or deletional variants. These variants 
ordinarily are prepared by site specific mutagenesis of nucleotides in the DNA encoding the 
prostate cancer protein, using cassette or PCR mutagenesis or other techniques well known in 
the art, to produce DNA encoding the variant, and thereafter expressing the DNA in 
recombinant cell culture as outlined above. However, variant prostate cancer protein 
fragments having up to about 100-150 residues may be prepared by in vitro synthesis using 
established techniques. Amino acid sequence variants are characterized by the predetermined 
nature of the variation, a feature that sets them apart from naturally occurring allelic or 
interspecies variation of the prostate cancer protein amino acid sequence. The variants 
typically exhibit the same qualitative biological activity as the naturally occurring analogue, 
although variants can also be selected which have modified characteristics as will be more 
fully outlined below. 

While the site or region for introducing an amino acid sequence variation is 
predetermined, the mutation per se need not be predetermined. For example, in order to 
optimize the performance of a mutation at a given site, random mutagenesis may be 
conducted at the target codon or region and the expressed prostate cancer variants screened 
for the optimal combination of desired activity. Techniques for making substitution 
mutations at predetermined sites in DNA having a known sequence are well known, e.g., 
M13 primer mutagenesis and PCR mutagenesis. Screening of the mutants is done using 
assays of prostate cancer protein activities. 

Amino acid substitutions are typically of single residues; insertions usually 
will be on the order of from about 1 to 20 amino acids, although considerably larger 
insertions may be tolerated. Deletions range from about 1 to about 20 residues, although in 
some cases deletions may be much larger. 

Substitutions, deletions, insertions or any combination thereof may be used to 
arrive at a final derivative. Generally these changes are done on a few amino acids to 
minimize the alteration of the molecule. However, larger changes may be tolerated in certain 
circumstances. When small alterations in the characteristics of the prostate cancer protein are 
desired, substitutions are generally made in accordance with the amino acid substitution 
relationships provided in the definition section. 

The variants typically exhibit the same qualitative biological activity and will 
elicit the same immune response as the naturally-occurring analog, although variants also are 
selected to modify the characteristics of the prostate cancer proteins as needed. Alternatively, 
the variant may be designed such that the biological activity of the prostate cancer protein is 
altered. For example, glycosylation sites may be altered or removed. 

Substantial changes in function or immunological identity are made by 

15 selecting substitutions that are less conservative than those described above. For example, 
substitutions may be made which more significantly affect: the structure of the polypeptide 
backbone in the area of the alteration, for example the alpha-helical or beta-sheet structure; 
the charge or hydrophobicity of the molecule at the target site; or the bulk of the side chain. 
The substitutions which in general are expected to produce the greatest changes in the
polypeptide's properties are those in which (a) a hydrophilic residue, e.g., seryl or threonyl, is
substituted for (or by) a hydrophobic residue, e.g., leucyl, isoleucyl, phenylalanyl, valyl or
alanyl; (b) a cysteine or proline is substituted for (or by) any other residue; (c) a residue
having an electropositive side chain, e.g., lysyl, arginyl, or histidyl, is substituted for (or by)
an electronegative residue, e.g., glutamyl or aspartyl; or (d) a residue having a bulky side
chain, e.g., phenylalanine, is substituted for (or by) one not having a side chain, e.g., glycine.
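
Categories (a) through (d) can be expressed as a simple lookup. The following Python sketch is illustrative only: the residue sets are limited to the examples named in the text, and the one-letter codes, set names and function names are assumptions introduced here rather than part of the disclosure.

```python
# Residue groups limited to the examples named in categories (a)-(d).
HYDROPHILIC = {"S", "T"}                  # seryl, threonyl
HYDROPHOBIC = {"L", "I", "F", "V", "A"}   # leucyl, isoleucyl, phenylalanyl, valyl, alanyl
ELECTROPOSITIVE = {"K", "R", "H"}         # lysyl, arginyl, histidyl
ELECTRONEGATIVE = {"E", "D"}              # glutamyl, aspartyl
BULKY = {"F"}                             # phenylalanine
NO_SIDE_CHAIN = {"G"}                     # glycine


def _swapped(a, b, group_one, group_two):
    """True if one residue is in group_one and the other is in group_two."""
    return (a in group_one and b in group_two) or (a in group_two and b in group_one)


def nonconservative_categories(original, replacement):
    """Return the categories (a)-(d) that a single-residue substitution falls into."""
    if original == replacement:
        return []
    hits = []
    if _swapped(original, replacement, HYDROPHILIC, HYDROPHOBIC):
        hits.append("a")
    if "C" in (original, replacement) or "P" in (original, replacement):
        hits.append("b")
    if _swapped(original, replacement, ELECTROPOSITIVE, ELECTRONEGATIVE):
        hits.append("c")
    if _swapped(original, replacement, BULKY, NO_SIDE_CHAIN):
        hits.append("d")
    return hits
```

An empty list means only that the substitution matches none of the named examples; it does not establish that the change is conservative.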

Covalent modifications of prostate cancer polypeptides are included within the 
scope of this invention. One type of covalent modification includes reacting targeted amino 
acid residues of a prostate cancer polypeptide with an organic derivatizing agent that is 
capable of reacting with selected side chains or the N- or C-terminal residues of a prostate
cancer polypeptide. Derivatization with bifunctional agents is useful, for instance, for
crosslinking prostate cancer polypeptides to a water-insoluble support matrix or surface for
use in the method for purifying anti-prostate cancer polypeptide antibodies or screening
assays, as is more fully described below. Commonly used crosslinking agents include, e.g.,
1,1-bis(diazoacetyl)-2-phenylethane, glutaraldehyde, N-hydroxysuccinimide esters, e.g.,
esters with 4-azidosalicylic acid, homobifunctional imidoesters, including disuccinimidyl
esters such as 3,3'-dithiobis(succinimidylpropionate), bifunctional maleimides such as
bis-N-maleimido-1,8-octane and agents such as methyl-3-((p-azidophenyl)dithio)propioimidate.

Other modifications include deamidation of glutaminyl and asparaginyl 
residues to the corresponding glutamyl and aspartyl residues, respectively, hydroxylation of 
proline and lysine, phosphorylation of hydroxyl groups of seryl, threonyl or tyrosyl residues,
methylation of the amino groups of the lysine, arginine, and histidine side chains (Creighton,
Proteins: Structure and Molecular Properties, pp. 79-86 (1983)), acetylation of the
N-terminal amine, and amidation of any C-terminal carboxyl group.

Another type of covalent modification of the prostate cancer polypeptide 
included within the scope of this invention comprises altering the native glycosylation pattern
of the polypeptide. "Altering the native glycosylation pattern" is intended for purposes
herein to mean deleting one or more carbohydrate moieties found in native sequence prostate
cancer polypeptide, and/or adding one or more glycosylation sites that are not present in the 
native sequence prostate cancer polypeptide. Glycosylation patterns can be altered in many 
ways. For example, the use of different cell types to express prostate cancer-associated
sequences can result in different glycosylation patterns.

Addition of glycosylation sites to prostate cancer polypeptides may also be 
accomplished by altering the amino acid sequence thereof. The alteration may be made, e.g., 
by the addition of, or substitution by, one or more serine or threonine residues to the native 
sequence prostate cancer polypeptide (for O-linked glycosylation sites). The prostate cancer
amino acid sequence may optionally be altered through changes at the DNA level,
particularly by mutating the DNA encoding the prostate cancer polypeptide at preselected
bases such that codons are generated that will translate into the desired amino acids. 

Another means of increasing the number of carbohydrate moieties on the 
prostate cancer polypeptide is by chemical or enzymatic coupling of glycosides to the
polypeptide. Such methods are described in the art, e.g., in WO 87/05330, and in Aplin &
Wriston, CRC Crit. Rev. Biochem., pp. 259-306 (1981).

Removal of carbohydrate moieties present on the prostate cancer polypeptide
may be accomplished chemically or enzymatically or by mutational substitution of codons
encoding for amino acid residues that serve as targets for glycosylation. Chemical
deglycosylation techniques are known in the art and described, for instance, by Hakimuddin
et al., Arch. Biochem. Biophys. 259:52 (1987) and by Edge et al., Anal. Biochem. 118:131
(1981). Enzymatic cleavage of carbohydrate moieties on polypeptides can be achieved by the
use of a variety of endo- and exo-glycosidases as described by Thotakura et al., Meth.
Enzymol. 138:350 (1987).

Another type of covalent modification of prostate cancer polypeptides comprises linking the
prostate cancer polypeptide to one of a variety of nonproteinaceous polymers, e.g.,
polyethylene glycol, polypropylene glycol, or polyoxyalkylenes, in the manner set forth in
U.S. Patent Nos. 4,640,835; 4,496,689; 4,301,144; 4,670,417; 4,791,192 or 4,179,337.

Prostate cancer polypeptides of the present invention may also be modified in 
a way to form chimeric molecules comprising a prostate cancer polypeptide fused to another,
heterologous polypeptide or amino acid sequence. In one embodiment, such a chimeric
molecule comprises a fusion of a prostate cancer polypeptide with a tag polypeptide which 
provides an epitope to which an anti-tag antibody can selectively bind. The epitope tag is 
generally placed at the amino- or carboxyl-terminus of the prostate cancer polypeptide. The
presence of such epitope-tagged forms of a prostate cancer polypeptide can be detected using
an antibody against the tag polypeptide. Also, provision of the epitope tag enables the
prostate cancer polypeptide to be readily purified by affinity purification using an anti-tag 
antibody or another type of affinity matrix that binds to the epitope tag. In an alternative 
embodiment, the chimeric molecule may comprise a fusion of a prostate cancer polypeptide 
with an immunoglobulin or a particular region of an immunoglobulin. For a bivalent form of
the chimeric molecule, such a fusion could be to the Fc region of an IgG molecule.

Various tag polypeptides and their respective antibodies are well known in the 
art. Examples include poly-histidine (poly-his) or poly-histidine-glycine (poly-his-gly) tags;
HIS6 and metal chelation tags; the flu HA tag polypeptide and its antibody 12CA5 (Field et
al., Mol. Cell. Biol. 8:2159-2165 (1988)); the c-myc tag and the 8F9, 3C7, 6E10, G4, B7 and
9E10 antibodies thereto (Evan et al., Molecular and Cellular Biology 5:3610-3616 (1985));
and the Herpes Simplex virus glycoprotein D (gD) tag and its antibody (Paborsky et al.,
Protein Engineering 3(6):547-553 (1990)). Other tag polypeptides include the Flag-peptide
(Hopp et al., Bio/Technology 6:1204-1210 (1988)); the KT3 epitope peptide (Martin et al.,
Science 255:192-194 (1992)); tubulin epitope peptide (Skinner et al., J. Biol. Chem.
266:15163-15166 (1991)); and the T7 gene 10 protein peptide tag (Lutz-Freyermuth et al.,
Proc. Natl. Acad. Sci. USA 87:6393-6397 (1990)).

Also included are other prostate cancer proteins of the prostate cancer family, 
and prostate cancer proteins from other organisms, which are cloned and expressed as 
outlined below. Thus, probe or degenerate polymerase chain reaction (PCR) primer 
sequences may be used to find other related prostate cancer proteins from humans or other
organisms. As will be appreciated by those in the art, particularly useful probe and/or PCR
primer sequences include the unique areas of the prostate cancer nucleic acid sequence. As is 
generally known in the art, preferred PCR primers are from about 15 to about 35 nucleotides 
in length, with from about 20 to about 30 being preferred, and may contain inosine as needed. 
The conditions for the PCR reaction are well known in the art (e.g., Innis, PCR Protocols,
supra).
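
The primer length guidance can be checked mechanically. The Python sketch below is a minimal illustration, not a primer design tool: it treats the "about" bounds (15 to 35 nucleotides, with 20 to 30 preferred) as hard cutoffs, and it assumes inosine is written as the letter I in the primer string. Both conventions and the function names are assumptions introduced here.

```python
def primer_length_class(primer):
    """Classify a primer by the length ranges stated in the text."""
    n = len(primer)
    if 20 <= n <= 30:
        return "preferred"        # about 20 to about 30 nucleotides
    if 15 <= n <= 35:
        return "within range"     # about 15 to about 35 nucleotides
    return "outside range"


def inosine_count(primer):
    """Count inosine positions, assuming inosine is written as 'I'."""
    return primer.upper().count("I")
```

For instance, a 22-nucleotide degenerate primer with two inosine positions would be reported as "preferred" with an inosine count of 2.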

Antibodies to prostate cancer proteins 

In a preferred embodiment, when the prostate cancer protein is to be used to 
generate antibodies, e.g., for immunotherapy or immunodiagnosis, the prostate cancer protein
should share at least one epitope or determinant with the full length protein. By "epitope" or
"determinant" herein is typically meant a portion of a protein which will generate and/or bind
an antibody or T-cell receptor in the context of MHC. Thus, in most instances, antibodies
made to a smaller prostate cancer protein will be able to bind to the full-length protein,
particularly linear epitopes. In a preferred embodiment, the epitope is unique; that is,
antibodies generated to a unique epitope show little or no cross-reactivity.

Methods of preparing polyclonal antibodies are known to the skilled artisan 
(e.g., Coligan, supra; and Harlow & Lane, supra). Polyclonal antibodies can be raised in a 
mammal, e.g., by one or more injections of an immunizing agent and, if desired, an adjuvant. 
Typically, the immunizing agent and/or adjuvant will be injected in the mammal by multiple
subcutaneous or intraperitoneal injections. The immunizing agent may include a protein
encoded by a nucleic acid of the figures or fragment thereof or a fusion protein thereof. It
may be useful to conjugate the immunizing agent to a protein known to be immunogenic in
the mammal being immunized. Examples of such immunogenic proteins include but are not
limited to keyhole limpet hemocyanin, serum albumin, bovine thyroglobulin, and soybean
trypsin inhibitor. Examples of adjuvants which may be employed include Freund's complete
adjuvant and MPL-TDM adjuvant (monophosphoryl Lipid A, synthetic trehalose
dicorynomycolate). The immunization protocol may be selected by one skilled in the art
without undue experimentation.

The antibodies may, alternatively, be monoclonal antibodies. Monoclonal 
antibodies may be prepared using hybridoma methods, such as those described by Kohler &
Milstein, Nature 256:495 (1975). In a hybridoma method, a mouse, hamster, or other
appropriate host animal is typically immunized with an immunizing agent to elicit
lymphocytes that produce or are capable of producing antibodies that will specifically bind to
the immunizing agent. Alternatively, the lymphocytes may be immunized in vitro. The
immunizing agent will typically include a polypeptide encoded by a nucleic acid of Tables
1-16, a fragment thereof, or a fusion protein thereof. Generally, either peripheral blood
lymphocytes ("PBLs") are used if cells of human origin are desired, or spleen cells or lymph
node cells are used if non-human mammalian sources are desired. The lymphocytes are then 
fused with an immortalized cell line using a suitable fusing agent, such as polyethylene 
glycol, to form a hybridoma cell (Goding, Monoclonal Antibodies: Principles and Practice,
pp. 59-103 (1986)). Immortalized cell lines are usually transformed mammalian cells,
particularly myeloma cells of rodent, bovine and human origin. Usually, rat or mouse
myeloma cell lines are employed. The hybridoma cells may be cultured in a suitable culture
medium that preferably contains one or more substances that inhibit the growth or survival of
the unfused, immortalized cells. For example, if the parental cells lack the enzyme
hypoxanthine guanine phosphoribosyl transferase (HGPRT or HPRT), the culture medium
for the hybridomas typically will include hypoxanthine, aminopterin, and thymidine ("HAT 
medium"), which substances prevent the growth of HGPRT-deficient cells. 

In one embodiment, the antibodies are bispecific antibodies. Bispecific 
antibodies are monoclonal, preferably human or humanized, antibodies that have binding
specificities for at least two different antigens or that have binding specificities for two
epitopes on the same antigen. In one embodiment, one of the binding specificities is for a
protein encoded by a nucleic acid of Tables 1-16 or a fragment thereof; the other one is for any
other antigen, and preferably for a cell-surface protein or receptor or receptor subunit, 
preferably one that is tumor specific. Alternatively, tetramer-type technology may create 
multivalent reagents. 

In a preferred embodiment, the antibodies to prostate cancer protein are
capable of reducing or eliminating a biological function of a prostate cancer protein, as is
described below. That is, the addition of anti-prostate cancer protein antibodies (either
polyclonal or preferably monoclonal) to prostate cancer tissue (or cells containing prostate
cancer) may reduce or eliminate the prostate cancer. Generally, at least a 25% decrease in
activity, growth, size or the like is preferred, with at least about 50% being particularly
preferred and about a 95-100% decrease being especially preferred.

In a preferred embodiment the antibodies to the prostate cancer proteins are
humanized antibodies (e.g., Xenerex Biosciences, Medarex, Inc., Abgenix, Inc., Protein
Design Labs, Inc.). Humanized forms of non-human (e.g., murine) antibodies are chimeric
molecules of immunoglobulins, immunoglobulin chains or fragments thereof (such as Fv,
Fab, Fab', F(ab')2 or other antigen-binding subsequences of antibodies) which contain
minimal sequence derived from non-human immunoglobulin. Humanized antibodies include
human immunoglobulins (recipient antibody) in which residues from a complementarity
determining region (CDR) of the recipient are replaced by residues from a CDR of a
non-human species (donor antibody) such as mouse, rat or rabbit having the desired specificity,
affinity and capacity. In some instances, Fv framework residues of the human
immunoglobulin are replaced by corresponding non-human residues. Humanized antibodies
may also comprise residues which are found neither in the recipient antibody nor in the
imported CDR or framework sequences. In general, a humanized antibody will comprise
substantially all of at least one, and typically two, variable domains, in which all or
substantially all of the CDR regions correspond to those of a non-human immunoglobulin
and all or substantially all of the framework (FR) regions are those of a human
immunoglobulin consensus sequence. The humanized antibody optimally also will comprise
at least a portion of an immunoglobulin constant region (Fc), typically that of a human
immunoglobulin (Jones et al., Nature 321:522-525 (1986); Riechmann et al., Nature
332:323-327 (1988); and Presta, Curr. Op. Struct. Biol. 2:593-596 (1992)). Humanization
can be essentially performed following the method of Winter and co-workers (Jones et al.,
Nature 321:522-525 (1986); Riechmann et al., Nature 332:323-327 (1988); Verhoeyen et al.,
Science 239:1534-1536 (1988)), by substituting rodent CDRs or CDR sequences for the
corresponding sequences of a human antibody. Accordingly, such humanized antibodies are
chimeric antibodies (U.S. Patent No. 4,816,567), wherein substantially less than an intact
human variable domain has been substituted by the corresponding sequence from a
non-human species.

Human antibodies can also be produced using various techniques known in the
art, including phage display libraries (Hoogenboom & Winter, J. Mol. Biol. 227:381 (1991);
Marks et al., J. Mol. Biol. 222:581 (1991)). The techniques of Cole et al. and Boerner et al.
are also available for the preparation of human monoclonal antibodies (Cole et al.,
Monoclonal Antibodies and Cancer Therapy, p. 77 (1985) and Boerner et al., J. Immunol.
147(1):86-95 (1991)). Similarly, human antibodies can be made by introducing human
immunoglobulin loci into transgenic animals, e.g., mice in which the endogenous
immunoglobulin genes have been partially or completely inactivated. Upon challenge,
human antibody production is observed, which closely resembles that seen in humans in all
respects, including gene rearrangement, assembly, and antibody repertoire. This approach is
described, e.g., in U.S. Patent Nos. 5,545,807; 5,545,806; 5,569,825; 5,625,126; 5,633,425;
5,661,016, and in the following scientific publications: Marks et al., Bio/Technology
10:779-783 (1992); Lonberg et al., Nature 368:856-859 (1994); Morrison, Nature 368:812-13
(1994); Fishwild et al., Nature Biotechnology 14:845-51 (1996); Neuberger, Nature
Biotechnology 14:826 (1996); Lonberg & Huszar, Intern. Rev. Immunol. 13:65-93 (1995).

By immunotherapy is meant treatment of prostate cancer with an antibody
raised against prostate cancer proteins. As used herein, immunotherapy can be passive or
active. Passive immunotherapy as defined herein is the passive transfer of antibody to a
recipient (patient). Active immunization is the induction of antibody and/or T-cell responses
in a recipient (patient). Induction of an immune response is the result of providing the
recipient with an antigen to which antibodies are raised. As appreciated by one of ordinary
skill in the art, the antigen may be provided by injecting a polypeptide against which
antibodies are desired to be raised into a recipient, or contacting the recipient with a nucleic
acid capable of expressing the antigen and under conditions for expression of the antigen,
leading to an immune response.

In a preferred embodiment the prostate cancer proteins against which
antibodies are raised are secreted proteins as described above. Without being bound by
theory, antibodies used for treatment bind and prevent the secreted protein from binding to
its receptor, thereby inactivating the secreted prostate cancer protein.

In another preferred embodiment, the prostate cancer protein to which
antibodies are raised is a transmembrane protein. Without being bound by theory, antibodies
used for treatment bind the extracellular domain of the prostate cancer protein and prevent it
from binding to other proteins, such as circulating ligands or cell-associated molecules. The
antibody may cause down-regulation of the transmembrane prostate cancer protein. As will
be appreciated by one of ordinary skill in the art, the antibody may be a competitive,
non-competitive or uncompetitive inhibitor of protein binding to the extracellular domain of the
prostate cancer protein. The antibody is also an antagonist of the prostate cancer protein.
Further, the antibody prevents activation of the transmembrane prostate cancer protein. In
one aspect, when the antibody prevents the binding of other molecules to the prostate cancer
protein, the antibody prevents growth of the cell. The antibody may also be used to target or
sensitize the cell to cytotoxic agents, including, but not limited to TNF-α, TNF-β, IL-1, IFN-γ
and IL-2, or chemotherapeutic agents including 5FU, vinblastine, actinomycin D, cisplatin,
methotrexate, and the like. In some instances the antibody belongs to a sub-type that
activates serum complement when complexed with the transmembrane protein thereby
mediating cytotoxicity or antibody-dependent cellular cytotoxicity (ADCC). Thus, prostate cancer is
treated by administering to a patient antibodies directed against the transmembrane prostate
cancer protein. Antibody-labeling may activate a co-toxin, localize a toxin payload, or
otherwise provide means to locally ablate cells.

In another preferred embodiment, the antibody is conjugated to an effector
moiety. The effector moiety can be any number of molecules, including labelling moieties
such as radioactive labels or fluorescent labels, or can be a therapeutic moiety. In one aspect
the therapeutic moiety is a small molecule that modulates the activity of the prostate cancer
protein. In another aspect the therapeutic moiety modulates the activity of molecules
associated with or in close proximity to the prostate cancer protein. The therapeutic moiety
may inhibit enzymatic activity such as protease or collagenase or protein kinase activity
associated with prostate cancer.

In a preferred embodiment, the therapeutic moiety can also be a cytotoxic
agent. In this method, targeting the cytotoxic agent to prostate cancer tissue or cells results
in a reduction in the number of afflicted cells, thereby reducing symptoms associated with
prostate cancer. Cytotoxic agents are numerous and varied and include, but are not limited
to, cytotoxic drugs or toxins or active fragments of such toxins. Suitable toxins and their
corresponding fragments include diphtheria A chain, exotoxin A chain, ricin A chain, abrin A
chain, curcin, crotin, phenomycin, enomycin and the like. Cytotoxic agents also include
radiochemicals made by conjugating radioisotopes to antibodies raised against prostate
cancer proteins, or binding of a radionuclide to a chelating agent that has been covalently
attached to the antibody. Targeting the therapeutic moiety to transmembrane prostate cancer
proteins not only serves to increase the local concentration of therapeutic moiety in the
prostate cancer afflicted area, but also serves to reduce deleterious side effects that may be
associated with the therapeutic moiety.

In another preferred embodiment, the prostate cancer protein against which the
antibodies are raised is an intracellular protein. In this case, the antibody may be conjugated
to a protein which facilitates entry into the cell. In one case, the antibody enters the cell by
endocytosis. In another embodiment, a nucleic acid encoding the antibody is administered to
the individual or cell. Moreover, where the prostate cancer protein can be targeted within a
cell, e.g., the nucleus, an antibody thereto contains a signal for that target localization, e.g., a
nuclear localization signal.

The prostate cancer antibodies of the invention specifically bind to prostate
cancer proteins. By "specifically bind" herein is meant that the antibodies bind to the protein
with a Kd of at least about 0.1 mM, more usually at least about 1 µM, preferably at least about
0.1 µM or better, and most preferably, 0.01 µM or better. Selectivity of binding is also
important.
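
The affinity tiers in this definition are easier to compare once every dissociation constant is expressed in one unit. The Python sketch below is illustrative only. It assumes the tiers read 0.1 mM, 1 µM, 0.1 µM and 0.01 µM, that "at least" means the stated Kd or a lower (tighter) value, and that the "about" bounds are hard cutoffs. The tier labels, unit spellings and function name are introduced here for illustration.

```python
# Conversion factors from each unit to nanomolar (nM).
TO_NANOMOLAR = {"mM": 1e6, "uM": 1e3, "nM": 1.0, "pM": 1e-3}

# (upper Kd bound in nM, label), ordered from tightest to weakest binding.
TIERS = [
    (10.0, "most preferred"),   # 0.01 uM or better
    (100.0, "preferred"),       # 0.1 uM or better
    (1000.0, "more usual"),     # about 1 uM
    (100000.0, "specific"),     # about 0.1 mM
]


def kd_tier(value, unit):
    """Return the tightest tier whose Kd bound the measurement satisfies."""
    if value <= 0:
        raise ValueError("Kd must be positive")
    kd_nm = value * TO_NANOMOLAR[unit]
    for bound_nm, label in TIERS:
        if kd_nm <= bound_nm:
            return label
    return "below the stated threshold"
```

Because a lower Kd means tighter binding, the tiers are tested from the smallest bound upward, so a 5 nM antibody is reported in the tightest tier rather than merely as "specific".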

Detection of prostate cancer sequence for diagnostic and therapeutic applications 

In one aspect, the RNA expression levels of genes are determined for different
cellular states in the prostate cancer phenotype. Expression levels of genes in normal tissue
(i.e., not undergoing prostate cancer) and in prostate cancer tissue (and in some cases, for
varying severities of prostate cancer that relate to prognosis, as outlined below) are evaluated
to provide expression profiles. An expression profile of a particular cell state or point of
development is essentially a "fingerprint" of the state. While two states may have any
particular gene similarly expressed, the evaluation of a number of genes simultaneously
allows the generation of a gene expression profile that is reflective of the state of the cell. By
comparing expression profiles of cells in different states, information regarding which genes
are important (including both up- and down-regulation of genes) in each of these states is
obtained. Then, diagnosis may be performed or confirmed to determine whether a tissue
sample has the gene expression profile of normal or cancerous tissue. This will provide for
molecular diagnosis of related conditions.

"Differential expression," or grammatical equivalents as used herein, refers to 
qualitative or quantitative differences in the temporal and/or cellular gene expression
patterns within and among cells and tissue. Thus, a differentially expressed gene can
qualitatively have its expression altered, including an activation or inactivation, in, e.g.,
normal versus prostate cancer tissue. Genes may be turned on or turned off in a particular
state, relative to another state thus permitting comparison of two or more states. A
qualitatively regulated gene will exhibit an expression pattern within a state or cell type
which is detectable by standard techniques. Some genes will be expressed in one state or cell
type, but not in both. Alternatively, the difference in expression may be quantitative, e.g., in
that expression is increased or decreased; i.e., gene expression is either upregulated, resulting
in an increased amount of transcript, or downregulated, resulting in a decreased amount of
transcript. The degree to which expression differs need only be large enough to quantify via
standard characterization techniques as outlined below, such as by use of Affymetrix
GeneChip™ expression arrays, Lockhart, Nature Biotechnology 14:1675-1680 (1996),
hereby expressly incorporated by reference. Other techniques include, but are not limited to,
quantitative reverse transcriptase PCR, northern analysis and RNase protection. As outlined
above, preferably the change in expression (i.e., upregulation or downregulation) is at least
about 50%, more preferably at least about 100%, more preferably at least about 150%, more
preferably at least about 200%, with from 300 to at least 1000% being especially preferred.
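
The percentage thresholds for a change in expression can be illustrated with a short calculation. The Python sketch below is a hedged example rather than the method of the disclosure: it measures the change against the lower of the two expression levels so that up- and down-regulation are treated symmetrically, which is an assumption made here because the text does not define the baseline. The function names and the hard cutoffs at 50, 100, 150, 200 and 300 percent are likewise illustrative.

```python
THRESHOLDS = (300, 200, 150, 100, 50)  # percent, highest first


def percent_change(reference, sample):
    """Signed percent change between two positive expression levels.

    The magnitude is computed relative to the lower level (an assumption),
    positive for upregulation and negative for downregulation.
    """
    if reference <= 0 or sample <= 0:
        raise ValueError("expression levels must be positive")
    low, high = sorted((reference, sample))
    magnitude = (high - low) / low * 100.0
    return magnitude if sample >= reference else -magnitude


def highest_threshold_met(reference, sample):
    """Return the largest listed threshold met by the change, or None."""
    magnitude = abs(percent_change(reference, sample))
    for threshold in THRESHOLDS:
        if magnitude >= threshold:
            return threshold
    return None
```

With a reference level of 100 units, a sample level of 250 units is a 150% increase, and a sample level of 25 units is reported as a 300% decrease under the symmetric convention assumed above.
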

Evaluation may be at the gene transcript, or the protein level. The amount of
gene expression may be monitored using nucleic acid probes to the DNA or RNA equivalent
of the gene transcript, and the quantification of gene expression levels, or, alternatively, the
final gene product itself (protein) can be monitored, e.g., with antibodies to the prostate
cancer protein and standard immunoassays (ELISAs, etc.) or other techniques, including
mass spectroscopy assays, 2D gel electrophoresis assays, etc. Proteins corresponding to 
prostate cancer genes, i.e., those identified as being important in a prostate cancer phenotype, 
can be evaluated in a prostate cancer diagnostic test. 

In a preferred embodiment, gene expression monitoring is performed
simultaneously on a number of genes. Multiple protein expression monitoring can be
performed as well. Similarly, these assays may be performed on an individual basis as well.

In this embodiment, the prostate cancer nucleic acid probes are attached to
biochips as outlined herein for the detection and quantification of prostate cancer sequences
in a particular cell. The assays are further described below in the example. PCR techniques
can be used to provide greater sensitivity.

In a preferred embodiment nucleic acids encoding the prostate cancer protein
are detected. Although DNA or RNA encoding the prostate cancer protein may be detected,
of particular interest are methods wherein an mRNA encoding a prostate cancer protein is
detected. Probes to detect mRNA can be a nucleotide/deoxynucleotide probe that is
complementary to and hybridizes with the mRNA and includes, but is not limited to,
oligonucleotides, cDNA or RNA. Probes also should contain a detectable label, as defined
herein. In one method the mRNA is detected after immobilizing the nucleic acid to be
examined on a solid support such as nylon membranes and hybridizing the probe with the
sample. Following washing to remove the non-specifically bound probe, the label is
detected. In another method detection of the mRNA is performed in situ. In this method
permeabilized cells or tissue samples are contacted with a detectably labeled nucleic acid
probe for sufficient time to allow the probe to hybridize with the target mRNA. Following
washing to remove the non-specifically bound probe, the label is detected. For example, a
digoxigenin-labeled riboprobe (RNA probe) that is complementary to the mRNA encoding a
prostate cancer protein is detected by binding the digoxigenin with an anti-digoxigenin
secondary antibody and developed with nitro blue tetrazolium and
5-bromo-4-chloro-3-indolyl phosphate.

In a preferred embodiment, various proteins from the three classes of proteins
as described herein (secreted, transmembrane or intracellular proteins) are used in diagnostic
assays. The prostate cancer proteins, antibodies, nucleic acids, modified proteins and cells
containing prostate cancer sequences are used in diagnostic assays. This can be performed on
an individual gene or corresponding polypeptide level. In a preferred embodiment, the
expression profiles are used, preferably in conjunction with high throughput screening
techniques to allow monitoring for expression profile genes and/or corresponding
polypeptides.

As described and defined herein, prostate cancer proteins, including 
intracellular, transmembrane or secreted proteins, find use as markers of prostate cancer. 
Detection of these proteins in putative prostate cancer tissue allows for detection or diagnosis 
of prostate cancer. In one embodiment, antibodies are used to detect prostate cancer proteins.
A preferred method separates proteins from a sample by electrophoresis on a gel (typically a
denaturing and reducing protein gel, but may be another type of gel, including isoelectric 
focusing gels and the like). Following separation of proteins, the prostate cancer protein is 
detected, e.g., by immunoblotting with antibodies raised against the prostate cancer protein. 
Methods of immunoblotting are well known to those of ordinary skill in the art. 

In another preferred method, antibodies to the prostate cancer protein find use
in in situ imaging techniques, e.g., in histology (e.g., Methods in Cell Biology: Antibodies in
Cell Biology, volume 37 (Asai, ed. 1993)). In this method cells are contacted with from one 
to many antibodies to the prostate cancer protein(s). Following washing to remove
non-specific antibody binding, the presence of the antibody or antibodies is detected. In one
embodiment the antibody is detected by incubating with a secondary antibody that contains a
detectable label. In another method the primary antibody to the prostate cancer protein(s) 
contains a detectable label, e.g. an enzyme marker that can act on a substrate. In another 
preferred embodiment each one of multiple primary antibodies contains a distinct and 
detectable label. This method finds particular use in simultaneous screening for a plurality of
prostate cancer proteins. As will be appreciated by one of ordinary skill in the art, many
other histological imaging techniques are also provided by the invention.

In a preferred embodiment the label is detected in a fluorometer which has the
ability to detect and distinguish emissions of different wavelengths. In addition, a 
fluorescence activated cell sorter (FACS) can be used in the method. 

In another preferred embodiment, antibodies find use in diagnosing prostate
cancer from blood, serum, plasma, stool, and other samples. Such samples, therefore, are
useful as samples to be probed or tested for the presence of prostate cancer proteins.
Antibodies can be used to detect a prostate cancer protein by previously described
immunoassay techniques including ELISA, immunoblotting (western blotting),
immunoprecipitation, BIACORE technology and the like. Conversely, the presence of
antibodies may indicate an immune response against an endogenous prostate cancer protein.

In a preferred embodiment, in situ hybridization of labeled prostate cancer
nucleic acid probes to tissue arrays is done. For example, arrays of tissue samples, including
prostate cancer tissue and/or normal tissue, are made. In situ hybridization (see, e.g.,
Ausubel, supra) is then performed. When comparing the fingerprints between an individual
and a standard, the skilled artisan can make a diagnosis, a prognosis, or a prediction based on
the findings. It is further understood that the genes which indicate the diagnosis may differ
from those which indicate the prognosis and molecular profiling of the condition of the cells
may lead to distinctions between responsive or refractory conditions or may be predictive of
outcomes.

In a preferred embodiment, the prostate cancer proteins, antibodies, nucleic
acids, modified proteins and cells containing prostate cancer sequences are used in prognosis
assays. As above, gene expression profiles can be generated that correlate to prostate cancer,
in terms of long term prognosis. Again, this may be done on either a protein or gene level,
with the use of genes being preferred. As above, prostate cancer probes may be attached to
biochips for the detection and quantification of prostate cancer sequences in a tissue or
patient. The assays proceed as outlined above for diagnosis. PCR methods may provide more
sensitive and accurate quantification.

Assays for therapeutic compounds 
In a preferred embodiment members of the proteins, nucleic acids, and
antibodies as described herein are used in drug screening assays. The prostate cancer
proteins, antibodies, nucleic acids, modified proteins and cells containing prostate cancer
sequences are used in drug screening assays or by evaluating the effect of drug candidates on
a "gene expression profile" or expression profile of polypeptides. In a preferred embodiment,
the expression profiles are used, preferably in conjunction with high throughput screening
techniques to allow monitoring for expression profile genes after treatment with a candidate
agent (e.g., Zlokarnik, et al., Science 279:84-8 (1998); Heid, Genome Res 6:986-94, 1996).

In a preferred embodiment, the prostate cancer proteins, antibodies, nucleic
acids, modified proteins and cells containing the native or modified prostate cancer proteins
are used in screening assays. That is, the present invention provides novel methods for
screening for compositions which modulate the prostate cancer phenotype or an identified
physiological function of a prostate cancer protein. As above, this can be done on an
individual gene level or by evaluating the effect of drug candidates on a "gene expression
profile". In a preferred embodiment, the expression profiles are used, preferably in
conjunction with high throughput screening techniques to allow monitoring for expression
profile genes after treatment with a candidate agent, see Zlokarnik, supra.

Having identified the differentially expressed genes herein, a variety of assays
may be executed. In a preferred embodiment, assays may be run on an individual gene or
protein level. That is, having identified a particular gene as up-regulated in prostate cancer,
test compounds can be screened for the ability to modulate gene expression or for binding to
the prostate cancer protein. "Modulation" thus includes both an increase and a decrease in
gene expression. The preferred amount of modulation will depend on the original change of
the gene expression in normal tissue versus tissue undergoing prostate cancer, with changes of
at least 10%, preferably 50%, more preferably 100-300%, and in some embodiments 300-1000%
or greater. Thus, if a gene exhibits a 4-fold increase in prostate cancer tissue
compared to normal tissue, a decrease of about four-fold is often desired; similarly, a 10-fold
decrease in prostate cancer tissue compared to normal tissue often provides a target value of a
10-fold increase in expression to be induced by the test compound.
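
The compensating change described above can be worked through numerically. The following is a minimal sketch, not part of the disclosed assays; the function name `target_modulation` and the convention of expressing the change as the ratio of expression in prostate cancer tissue to expression in normal tissue are assumptions made for illustration.

```python
from typing import Tuple


def target_modulation(cancer_over_normal: float) -> Tuple[str, float]:
    """Return the direction and fold magnitude of the change a test compound
    would need to induce to bring expression back toward the normal level.

    cancer_over_normal (assumed convention): expression in prostate cancer
    tissue divided by expression in normal tissue; must be positive.
    """
    if cancer_over_normal <= 0:
        raise ValueError("expression ratio must be positive")
    if cancer_over_normal > 1:
        # Up-regulated in cancer tissue: a decrease of the same fold is sought.
        return ("decrease", cancer_over_normal)
    if cancer_over_normal < 1:
        # Down-regulated in cancer tissue: an increase of the same fold is sought.
        return ("increase", 1.0 / cancer_over_normal)
    return ("none", 1.0)


print(target_modulation(4.0))  # gene with a 4-fold increase in cancer tissue
print(target_modulation(0.1))  # gene with a 10-fold decrease in cancer tissue
```

Under these assumptions, a ratio of 4.0 gives a desired decrease of about four-fold and a ratio of 0.1 gives a desired increase of about 10-fold, which corresponds to the two cases discussed in the text.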

The amount of gene expression may be monitored using nucleic acid probes
and the quantification of gene expression levels, or, alternatively, the gene product itself can
be monitored, e.g., through the use of antibodies to the prostate cancer protein and standard
immunoassays. Proteomics and separation techniques may also allow quantification of
expression.

In a preferred embodiment, the gene expression or protein levels of a number
of entities, i.e., an expression profile, are monitored simultaneously. Such profiles will
typically involve a plurality of those entities described herein.

In this embodiment, the prostate cancer nucleic acid probes are attached to
biochips as outlined herein for the detection and quantification of prostate cancer sequences
in a particular cell. Alternatively, PCR may be used. Thus, a series, e.g., of microtiter plates,
may be used with primers dispensed in desired wells. A PCR reaction can then be performed
and analyzed for each well.

Expression monitoring can be performed to identify compounds that modify
the expression of one or more prostate cancer-associated sequences, e.g., a polynucleotide
sequence set out in Tables 1-16. Generally, in a preferred embodiment, a test modulator is
added to the cells prior to analysis. Moreover, screens are also provided to identify agents
that modulate prostate cancer, modulate prostate cancer proteins, bind to a prostate cancer
protein, or interfere with the binding of a prostate cancer protein and an antibody or other
binding partner.

The term "test compound" or "drug candidate" or "modulator" or grammatical
equivalents as used herein describes any molecule, e.g., protein, oligopeptide, small organic
molecule, polysaccharide, polynucleotide, etc., to be tested for the capacity to directly or
indirectly alter the prostate cancer phenotype or the expression of a prostate cancer sequence,
e.g., a nucleic acid or protein sequence. In preferred embodiments, modulators alter
expression profiles, or expression profile nucleic acids or proteins provided herein. In one
embodiment, the modulator suppresses a prostate cancer phenotype, e.g., to a normal tissue
fingerprint. In another embodiment, a modulator induces a prostate cancer phenotype.
Generally, a plurality of assay mixtures are run in parallel with different agent concentrations
to obtain a differential response to the various concentrations. Typically, one of these
concentrations serves as a negative control, i.e., at zero concentration or below the level of
detection.
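
A parallel concentration series of this kind can be laid out as follows. This is an illustrative sketch only; the function name `concentration_series`, the use of a fixed serial dilution factor, and the example values are assumptions and are not specified in the text.

```python
from typing import List


def concentration_series(top: float, dilution_factor: float, n_points: int) -> List[float]:
    """Build a descending serial dilution of a candidate agent and append a
    zero-concentration mixture to serve as the negative control."""
    if top <= 0 or dilution_factor <= 1 or n_points < 1:
        raise ValueError("need top > 0, dilution_factor > 1 and n_points >= 1")
    series = [top / dilution_factor ** i for i in range(n_points)]
    series.append(0.0)  # negative control: no agent added
    return series


# Hypothetical example: three 2-fold dilutions starting at 8 (arbitrary units).
print(concentration_series(8.0, 2.0, 3))
```

Each value would correspond to one assay mixture run in parallel, so that the response can be compared across concentrations and against the control.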

Drug candidates encompass numerous chemical classes, though typically they
are organic molecules, preferably small organic compounds having a molecular weight of
more than 100 and less than about 2,500 daltons. Preferred small molecules are less than
2000, or less than 1500 or less than 1000 or less than 500 D. Candidate agents comprise
functional groups necessary for structural interaction with proteins, particularly hydrogen
bonding, and typically include at least an amine, carbonyl, hydroxyl or carboxyl group,
preferably at least two of the functional chemical groups. The candidate agents often
comprise cyclical carbon or heterocyclic structures and/or aromatic or polyaromatic
structures substituted with one or more of the above functional groups. Candidate agents are
also found among biomolecules including peptides, saccharides, fatty acids, steroids, purines,
pyrimidines, derivatives, structural analogs or combinations thereof. Particularly preferred
are peptides.

In one aspect, a modulator will neutralize the effect of a prostate cancer
protein. By "neutralize" is meant that activity of a protein, and the consequent effect on the
cell, is inhibited or blocked.

In certain embodiments, combinatorial libraries of potential modulators will be
screened for an ability to bind to a prostate cancer polypeptide or to modulate activity.
Conventionally, new chemical entities with useful properties are generated by identifying a
chemical compound (called a "lead compound") with some desirable property or activity,
e.g., inhibiting activity, creating variants of the lead compound, and evaluating the property
and activity of those variant compounds. Often, high throughput screening (HTS) methods
are employed for such an analysis.

In one preferred embodiment, high throughput screening methods involve
providing a library containing a large number of potential therapeutic compounds (candidate
compounds). Such "combinatorial chemical libraries" are then screened in one or more
assays to identify those library members (particular chemical species or subclasses) that
display a desired characteristic activity. The compounds thus identified can serve as
conventional "lead compounds" or can themselves be used as potential or actual therapeutics.

A combinatorial chemical library is a collection of diverse chemical
compounds generated by either chemical synthesis or biological synthesis by combining a
number of chemical "building blocks" such as reagents. For example, a linear combinatorial
chemical library, such as a polypeptide (e.g., mutein) library, is formed by combining a set of
chemical building blocks called amino acids in every possible way for a given compound
length (i.e., the number of amino acids in a polypeptide compound). Millions of chemical
compounds can be synthesized through such combinatorial mixing of chemical building
blocks (Gallop et al., J. Med. Chem. 37(9):1233-1251 (1994)).
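
The scale of such a linear library follows directly from the number of building blocks and the compound length. The sketch below is illustrative only; it assumes the 20 standard amino acids as the building-block set, and the names `library_size` and `enumerate_library` are hypothetical.

```python
from itertools import product
from typing import Iterator

# The 20 standard amino acids in one-letter code (assumed building-block set).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def library_size(length: int, n_building_blocks: int = len(AMINO_ACIDS)) -> int:
    """Number of distinct linear compounds of the given length."""
    return n_building_blocks ** length


def enumerate_library(length: int, building_blocks: str = AMINO_ACIDS) -> Iterator[str]:
    """Yield every possible sequence of the given length, one at a time."""
    for combo in product(building_blocks, repeat=length):
        yield "".join(combo)


print(library_size(5))                     # pentapeptides from 20 building blocks
print(sum(1 for _ in enumerate_library(2)))  # dipeptides, counted by enumeration
```

With 20 building blocks, a length of five already gives 3,200,000 distinct sequences, which is consistent with the statement that millions of compounds can be produced by combinatorial mixing.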

Preparation and screening of combinatorial chemical libraries is well known to
those of skill in the art. Such combinatorial chemical libraries include, but are not limited to,
peptide libraries (see, e.g., U.S. Patent No. 5,010,175, Furka, Pept. Prot. Res. 37:487-493
(1991), Houghton et al., Nature, 354:84-88 (1991)), peptoids (PCT Publication No. WO
91/19735), encoded peptides (PCT Publication WO 93/20242), random bio-oligomers (PCT
Publication WO 92/00091), benzodiazepines (U.S. Pat. No. 5,288,514), diversomers such as
hydantoins, benzodiazepines and dipeptides (Hobbs et al., Proc. Nat. Acad. Sci. USA
90:6909-6913 (1993)), vinylogous polypeptides (Hagihara et al., J. Amer. Chem. Soc.
114:6568 (1992)), nonpeptidal peptidomimetics with a Beta-D-Glucose scaffolding
(Hirschmann et al., J. Amer. Chem. Soc. 114:9217-9218 (1992)), analogous organic syntheses
of small compound libraries (Chen et al., J. Amer. Chem. Soc. 116:2661 (1994)),
oligocarbamates (Cho, et al., Science 261:1303 (1993)), and/or peptidyl phosphonates
(Campbell et al., J. Org. Chem. 59:658 (1994)). See, generally, Gordon et al., J. Med. Chem.
37:1385 (1994), nucleic acid libraries (see, e.g., Strategene, Corp.), peptide nucleic acid
libraries (see, e.g., U.S. Patent 5,539,083), antibody libraries (see, e.g., Vaughn et al., Nature
Biotechnology 14(3):309-314 (1996), and PCT/US96/10287), carbohydrate libraries (see,
e.g., Liang et al., Science 274:1520-1522 (1996), and U.S. Patent No. 5,593,853), and small
organic molecule libraries (see, e.g., benzodiazepines, Baum, C&EN, Jan 18, page 33 (1993);
isoprenoids, U.S. Patent No. 5,569,588; thiazolidinones and metathiazanones, U.S. Patent No.
5,549,974; pyrrolidines, U.S. Patent Nos. 5,525,735 and 5,519,134; morpholino compounds,
U.S. Patent No. 5,506,337; benzodiazepines, U.S. Patent No. 5,288,514; and the like).

Devices for the preparation of combinatorial libraries are commercially
available (see, e.g., 357 MPS, 390 MPS, Advanced Chem Tech, Louisville KY; Symphony,
Rainin, Woburn, MA; 433A, Applied Biosystems, Foster City, CA; 9050 Plus, Millipore,
Bedford, MA).

A number of well known robotic systems have also been developed for
solution phase chemistries. These systems include automated workstations like the
automated synthesis apparatus developed by Takeda Chemical Industries, LTD. (Osaka,
Japan) and many robotic systems utilizing robotic arms (Zymate II, Zymark Corporation,
Hopkinton, Mass.; Orca, Hewlett-Packard, Palo Alto, Calif.), which mimic the manual
synthetic operations performed by a chemist. Any of the above devices are suitable for use
with the present invention. The nature and implementation of modifications to these devices
(if any) so that they can operate as discussed herein will be apparent to persons skilled in the
relevant art. In addition, numerous combinatorial libraries are themselves commercially
available (see, e.g., ComGenex, Princeton, N.J., Asinex, Moscow, RU, Tripos, Inc., St. Louis,
MO, ChemStar, Ltd, Moscow, RU, 3D Pharmaceuticals, Exton, PA, Martek Biosciences,
Columbia, MD, etc.).

The assays to identify modulators are amenable to high throughput screening.
Preferred assays thus detect enhancement or inhibition of prostate cancer gene transcription,
inhibition or enhancement of polypeptide expression, and inhibition or enhancement of
polypeptide activity.

High throughput assays for the presence, absence, quantification, or other
properties of particular nucleic acids or protein products are well known to those of skill in
the art. Binding assays and reporter gene assays are similarly well known. Thus,
e.g., U.S. Patent No. 5,559,410 discloses high throughput screening methods for proteins,
U.S. Patent No. 5,585,639 discloses high throughput screening methods for nucleic acid
binding (i.e., in arrays), while U.S. Patent Nos. 5,576,220 and 5,541,061 disclose high
throughput methods of screening for ligand/antibody binding.

In addition, high throughput screening systems are commercially available
(see, e.g., Zymark Corp., Hopkinton, MA; Air Technical Industries, Mentor, OH; Beckman
Instruments, Inc., Fullerton, CA; Precision Systems, Inc., Natick, MA, etc.). These systems
typically automate entire procedures, including all sample and reagent pipetting, liquid
dispensing, timed incubations, and final readings of the microplate in detector(s) appropriate
for the assay. These configurable systems provide high throughput and rapid start up as well
as a high degree of flexibility and customization. The manufacturers of such systems provide
detailed protocols for various high throughput systems. Thus, e.g., Zymark Corp. provides
technical bulletins describing screening systems for detecting the modulation of gene
transcription, ligand binding, and the like.

In one embodiment, modulators are proteins, often naturally occurring
proteins or fragments of naturally occurring proteins. Thus, e.g., cellular extracts containing
proteins, or random or directed digests of proteinaceous cellular extracts, may be used. In
this way libraries of proteins may be made for screening in the methods of the invention.
Particularly preferred in this embodiment are libraries of bacterial, fungal, viral, and
mammalian proteins, with the latter being preferred, and human proteins being especially
preferred. Particularly useful test compounds will be directed to the class of proteins to which
the target belongs, e.g., substrates for enzymes or ligands and receptors.

In a preferred embodiment, modulators are peptides of from about 5 to about
30 amino acids, with from about 5 to about 20 amino acids being preferred, and from about 7
to about 15 being particularly preferred. The peptides may be digests of naturally occurring
proteins as is outlined above, random peptides, or "biased" random peptides. By
"randomized" or grammatical equivalents herein is meant that each nucleic acid and peptide
consists of essentially random nucleotides and amino acids, respectively. Since generally
these random peptides (or nucleic acids, discussed below) are chemically synthesized, they
may incorporate any nucleotide or amino acid at any position. The synthetic process can be
designed to generate randomized proteins or nucleic acids, to allow the formation of all or
most of the possible combinations over the length of the sequence, thus forming a library of
randomized candidate bioactive proteinaceous agents.

In one embodiment, the library is fully randomized, with no sequence
preferences or constants at any position. In a preferred embodiment, the library is biased.
That is, some positions within the sequence are either held constant, or are selected from a
limited number of possibilities. For example, in a preferred embodiment, the nucleotides or
amino acid residues are randomized within a defined class, e.g., of hydrophobic amino acids,
hydrophilic residues, sterically biased (either small or large) residues, towards the creation of
nucleic acid binding domains, the creation of cysteines, for cross-linking, prolines for SH-3
domains, serines, threonines, tyrosines or histidines for phosphorylation sites, etc., or to
purines, etc.
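
The difference between a fully randomized and a biased library can be sketched in a few lines. This is an illustration under stated assumptions, not a synthesis protocol: the residue grouping in `HYDROPHOBIC`, the template layout, and the function name `biased_peptide` are all hypothetical choices made for the example.

```python
import random
from typing import List

ALL_RESIDUES = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard amino acids
HYDROPHOBIC = "AVLIMFW"                # assumed residue class, for illustration


def biased_peptide(template: List[str], rng: random.Random) -> str:
    """Draw one peptide. Each template entry lists the residues allowed at that
    position: a single letter holds the position constant, a residue class
    restricts it, and ALL_RESIDUES leaves it fully random."""
    return "".join(rng.choice(allowed) for allowed in template)


# Hypothetical 6-mer: fixed Cys, three hydrophobic positions, one fully
# random position, fixed Pro.
template = ["C", HYDROPHOBIC, HYDROPHOBIC, HYDROPHOBIC, ALL_RESIDUES, "P"]
rng = random.Random(0)  # seeded so the draw is reproducible
library = [biased_peptide(template, rng) for _ in range(5)]
print(library)
```

A fully randomized library corresponds to a template in which every position is `ALL_RESIDUES`; biasing simply narrows the allowed set at selected positions.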

Modulators of prostate cancer can also be nucleic acids, as defined below. As
described above generally for proteins, nucleic acid modulating agents may be naturally
occurring nucleic acids, random nucleic acids, or "biased" random nucleic acids. For
example, digests of procaryotic or eucaryotic genomes may be used as is outlined above for
proteins.

In certain embodiments, the activity of a prostate cancer-associated protein is
down-regulated, or entirely inhibited, by the use of an antisense polynucleotide, i.e., a nucleic
acid complementary to, and which can preferably hybridize specifically to, a coding mRNA
nucleic acid sequence, e.g., a prostate cancer protein mRNA, or a subsequence thereof.
Binding of the antisense polynucleotide to the mRNA reduces the translation and/or stability
of the mRNA.

In the context of this invention, antisense polynucleotides can comprise
naturally-occurring nucleotides, or synthetic species formed from naturally-occurring
subunits or their close homologs. Antisense polynucleotides may also have altered sugar
moieties or inter-sugar linkages. Exemplary among these are the phosphorothioate and other
sulfur containing species which are known for use in the art. Analogs are comprehended by
this invention so long as they function effectively to hybridize with the prostate cancer
protein mRNA. See, e.g., Isis Pharmaceuticals, Carlsbad, CA; Sequitor, Inc., Natick, MA.

Such antisense polynucleotides can readily be synthesized using recombinant
means, or can be synthesized in vitro. Equipment for such synthesis is sold by several
vendors, including Applied Biosystems. The preparation of other oligonucleotides such as
phosphorothioates and alkylated derivatives is also well known to those of skill in the art.

Antisense molecules as used herein include antisense or sense
oligonucleotides. Sense oligonucleotides can, e.g., be employed to block transcription by
binding to the anti-sense strand. The antisense and sense oligonucleotides comprise a
single-stranded nucleic acid sequence (either RNA or DNA) capable of binding to target mRNA
(sense) or DNA (antisense) sequences for prostate cancer molecules. Antisense or sense
oligonucleotides, according to the present invention, comprise a fragment generally at least
about 14 nucleotides, preferably from about 14 to 30 nucleotides. The ability to derive an
antisense or a sense oligonucleotide, based upon a cDNA sequence encoding a given protein
is described in, e.g., Stein & Cohen (Cancer Res. 48:2659 (1988)) and van der Krol et al.
(BioTechniques 6:958 (1988)).
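
Deriving a candidate antisense oligonucleotide from a sense-strand cDNA sequence amounts to taking the reverse complement of a chosen window. The sketch below is a simplified illustration that ignores practical design considerations such as target accessibility and specificity; the function name `antisense_oligo`, the default length, and the example sequence are hypothetical, and the 14 to 30 nucleotide check merely mirrors the preferred range stated above.

```python
_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def antisense_oligo(cdna_sense: str, start: int, length: int = 20) -> str:
    """Return a DNA antisense oligonucleotide, written 5'->3', complementary to
    the window cdna_sense[start:start + length] of a sense-strand cDNA."""
    if not 14 <= length <= 30:
        raise ValueError("length outside the preferred 14-30 nucleotide range")
    window = cdna_sense[start:start + length].upper()
    if len(window) < length:
        raise ValueError("window runs past the end of the sequence")
    # Reverse the window, then complement each base.
    return "".join(_COMPLEMENT[base] for base in reversed(window))


# Hypothetical 22-nt sense sequence, used only to exercise the function.
sense = "ATGGCCAAGCTTGACCTGAAGG"
print(antisense_oligo(sense, 0, 14))
```

A sense oligonucleotide for the same window would simply be the window itself; an RNA version would substitute U for T.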

In addition to antisense polynucleotides, ribozymes can be used to target and
inhibit transcription of prostate cancer-associated nucleotide sequences. A ribozyme is an
RNA molecule that catalytically cleaves other RNA molecules. Different kinds of ribozymes
have been described, including group I ribozymes, hammerhead ribozymes, hairpin
ribozymes, RNase P, and axhead ribozymes (see, e.g., Castanotto et al., Adv. in
Pharmacology 25: 289-317 (1994) for a general review of the properties of different
ribozymes).

The general features of hairpin ribozymes are described, e.g., in Hampel et al.,
Nucl. Acids Res. 18:299-304 (1990); European Patent Publication No. 0 360 257; U.S. Patent
No. 5,254,678. Methods of preparation are well known to those of skill in the art (see, e.g.,
WO 94/26877; Ojwang et al., Proc. Natl. Acad. Sci. USA 90:6340-6344 (1993); Yamada et
al., Human Gene Therapy 1:39-45 (1994); Leavitt et al., Proc. Natl. Acad. Sci. USA 92:699-703
(1995); Leavitt et al., Human Gene Therapy 5:1151-120 (1994); and Yamada et al.,
Virology 205: 121-126 (1994)).

Polynucleotide modulators of prostate cancer may be introduced into a cell
containing the target nucleotide sequence by formation of a conjugate with a ligand binding
molecule, as described in WO 91/04753. Suitable ligand binding molecules include, but are
not limited to, cell surface receptors, growth factors, other cytokines, or other ligands that
bind to cell surface receptors. Preferably, conjugation of the ligand binding molecule does
not substantially interfere with the ability of the ligand binding molecule to bind to its
corresponding molecule or receptor, or block entry of the sense or antisense oligonucleotide
or its conjugated version into the cell. Alternatively, a polynucleotide modulator of prostate
cancer may be introduced into a cell containing the target nucleic acid sequence, e.g., by
formation of a polynucleotide-lipid complex, as described in WO 90/10448. It is
understood that antisense molecules or knock out and knock in models may also be
used in screening assays as discussed above, in addition to methods of treatment.

As noted above, gene expression monitoring is conveniently used to test
candidate modulators (e.g., protein, nucleic acid or small molecule). After the candidate agent
has been added and the cells allowed to incubate for some period of time, the sample
containing a target sequence to be analyzed is added to the biochip. If required, the target
sequence is prepared using known techniques. For example, the sample may be treated to
lyse the cells, using known lysis buffers, electroporation, etc., with purification and/or
amplification such as PCR performed as appropriate. For example, an in vitro transcription
with labels covalently attached to the nucleotides is performed. Generally, the nucleic acids
are labeled with biotin-FITC or PE, or with cy3 or cy5.

In a preferred embodiment, the target sequence is labeled with, e.g., a
fluorescent, a chemiluminescent, a chemical, or a radioactive signal, to provide a means of
detecting the target sequence's specific binding to a probe. The label also can be an enzyme,
such as alkaline phosphatase or horseradish peroxidase, which when provided with an
appropriate substrate produces a product that can be detected. Alternatively, the label can be
a labeled compound or small molecule, such as an enzyme inhibitor, that binds but is not
catalyzed or altered by the enzyme. The label also can be a moiety or compound, such as an
epitope tag or biotin which specifically binds to streptavidin. For the example of biotin, the
streptavidin is labeled as described above, thereby providing a detectable signal for the
bound target sequence. Unbound labeled streptavidin is typically removed prior to analysis.

As will be appreciated by those in the art, these assays can be direct
hybridization assays or can comprise "sandwich assays", which include the use of multiple
probes, as is generally outlined in U.S. Patent Nos. 5,681,702, 5,597,909, 5,545,730,
5,594,117, 5,591,584, 5,571,670, 5,580,731, 5,571,670, 5,591,584, 5,624,802, 5,635,352,
5,594,118, 5,359,100, 5,124,246 and 5,681,697, all of which are hereby incorporated by
reference. In this embodiment, in general, the target nucleic acid is prepared as outlined
above, and then added to the biochip comprising a plurality of nucleic acid probes, under
conditions that allow the formation of a hybridization complex.

A variety of hybridization conditions may be used in the present invention,
including high, moderate and low stringency conditions as outlined above. The assays are
generally run under stringency conditions which allow formation of the label probe
hybridization complex only in the presence of target. Stringency can be controlled by
altering a step parameter that is a thermodynamic variable, including, but not limited to,
temperature, formamide concentration, salt concentration, chaotropic salt concentration, pH,
organic solvent concentration, etc.

These parameters may also be used to control non-specific binding, as is
generally outlined in U.S. Patent No. 5,681,697. Thus it may be desirable to perform certain
steps at higher stringency conditions to reduce non-specific binding.

The reactions outlined herein may be accomplished in a variety of ways.
Components of the reaction may be added simultaneously, or sequentially, in different orders,
with preferred embodiments outlined below. In addition, the reaction may include a variety
of other reagents. These include salts, buffers, neutral proteins, e.g., albumin, detergents, etc.
which may be used to facilitate optimal hybridization and detection, and/or reduce non-specific
or background interactions. Reagents that otherwise improve the efficiency of the
assay, such as protease inhibitors, nuclease inhibitors, anti-microbial agents, etc., may also be
used as appropriate, depending on the sample preparation methods and purity of the target.

The assay data are analyzed to determine the expression levels, and changes in
expression levels as between states, of individual genes, forming a gene expression profile.
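
One common way to summarize changes in expression levels between two states is a per-gene log ratio. The sketch below is an illustration under stated assumptions and is not the analysis method of the disclosure: the use of log2 ratios, the pseudocount, the function name `expression_profile`, and the gene labels and signal values are all hypothetical.

```python
import math
from typing import Dict


def expression_profile(state_a: Dict[str, float],
                       state_b: Dict[str, float],
                       pseudocount: float = 1.0) -> Dict[str, float]:
    """Return {gene: log2(state_a / state_b)} for genes measured in both states.

    The pseudocount (an assumption) avoids division by zero for genes with
    no detectable signal in one state.
    """
    profile = {}
    for gene in state_a.keys() & state_b.keys():
        ratio = (state_a[gene] + pseudocount) / (state_b[gene] + pseudocount)
        profile[gene] = math.log2(ratio)
    return profile


# Hypothetical signal intensities for two placeholder genes.
cancer = {"geneA": 399.0, "geneB": 99.0}
normal = {"geneA": 99.0, "geneB": 399.0}
print(expression_profile(cancer, normal))
```

Positive values indicate higher expression in the first state and negative values lower expression, so the resulting dictionary serves as a simple expression profile for comparing states.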

Screens are performed to identify modulators of the prostate cancer
phenotype. In one embodiment, screening is performed to identify modulators that can
induce or suppress a particular expression profile, thus preferably generating the associated
phenotype. In another embodiment, e.g., for diagnostic applications, having identified
differentially expressed genes important in a particular state, screens can be performed to
identify modulators that alter expression of individual genes. In another embodiment,
screening is performed to identify modulators that alter a biological function of the
expression product of a differentially expressed gene. Again, having identified the
importance of a gene in a particular state, screens are performed to identify agents that bind
and/or modulate the biological activity of the gene product.

In addition, screens can be done for genes that are induced in response to a
candidate agent. After identifying a modulator based upon its ability to suppress a prostate
cancer expression pattern leading to a normal expression pattern, or to modulate a single
prostate cancer gene expression profile so as to mimic the expression of the gene from
normal tissue, a screen as described above can be performed to identify genes that are
specifically modulated in response to the agent. Comparing expression profiles between
normal tissue and agent treated prostate cancer tissue reveals genes that are not expressed in
normal tissue or prostate cancer tissue, but are expressed in agent treated tissue. These
agent-specific sequences can be identified and used by methods described herein for prostate cancer
genes or proteins. In particular these sequences and the proteins they encode find use in
marking or identifying agent treated cells. In addition, antibodies can be raised against the
agent induced proteins and used to target novel therapeutics to the treated prostate cancer
tissue sample.

Thus, in one embodiment, a test compound is administered to a population of
prostate cancer cells that have an associated prostate cancer expression profile. By
"administration" or "contacting" herein is meant that the candidate agent is added to the cells
in such a manner as to allow the agent to act upon the cell, whether by uptake and
intracellular action, or by action at the cell surface. In some embodiments, nucleic acid
encoding a proteinaceous candidate agent (i.e., a peptide) may be put into a viral construct
such as an adenoviral or retroviral construct, and added to the cell, such that expression of
the peptide agent is accomplished, e.g., PCT US97/01019. Regulatable gene therapy systems
can also be used.

Once the test compound has been administered to the cells, the cells can be
washed if desired and are allowed to incubate under preferably physiological conditions for
some period of time. The cells are then harvested and a new gene expression profile is
generated, as outlined herein.

Thus, e.g., prostate cancer tissue may be screened for agents that modulate,
e.g., induce or suppress the prostate cancer phenotype. A change in at least one gene,
preferably many, of the expression profile indicates that the agent has an effect on prostate
cancer activity. By defining such a signature for the prostate cancer phenotype, screens for
new drugs that alter the phenotype can be devised. With this approach, the drug target need
not be known and need not be represented in the original expression screening platform, nor
does the level of transcript for the target protein need to change.
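
The criterion that an agent has an effect when at least one gene of the profile changes can be expressed as a simple comparison of profiles before and after treatment. The sketch below is illustrative only; the 2-fold threshold, the function name `changed_genes`, and the example values are assumptions, since the text does not fix a particular cutoff at this point.

```python
from typing import Dict, List


def changed_genes(baseline: Dict[str, float],
                  treated: Dict[str, float],
                  min_fold: float = 2.0) -> List[str]:
    """List the genes whose expression differs by at least min_fold, in either
    direction, between the baseline and agent-treated profiles."""
    changed = []
    for gene, base in baseline.items():
        new = treated.get(gene)
        if new is None or base <= 0 or new <= 0:
            continue  # cannot form a ratio for this gene; skip it
        fold = max(new / base, base / new)
        if fold >= min_fold:
            changed.append(gene)
    return sorted(changed)


# Hypothetical profiles for three placeholder genes.
before = {"g1": 100.0, "g2": 100.0, "g3": 100.0}
after = {"g1": 25.0, "g2": 110.0, "g3": 300.0}
hits = changed_genes(before, after)
print(hits, "-> agent has an effect" if hits else "-> no change detected")
```

Because the comparison is made across the whole profile, the gene encoding the drug target itself need not appear among the changed genes for an effect to be detected.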

In a preferred embodiment, as outlined above, screens may be done on
individual genes and gene products (proteins). That is, having identified a particular
differentially expressed gene as important in a particular state, screening of modulators of
either the expression of the gene or the gene product itself can be done. The gene products of
differentially expressed genes are sometimes referred to herein as "prostate cancer proteins"
or a "prostate cancer modulatory protein". The prostate cancer modulatory protein may be a
fragment, or alternatively, be the full length protein to the fragment encoded by the nucleic
acids of Tables 1-16. Preferably, the prostate cancer modulatory protein is a fragment. In a
preferred embodiment, the prostate cancer amino acid sequence which is used to determine
sequence identity or similarity is encoded by a nucleic acid of Tables 1-16. In another
embodiment, the sequences are naturally occurring allelic variants of a protein encoded by a
nucleic acid of Tables 1-16. In another embodiment, the sequences are sequence variants as
further described herein.

Preferably, the prostate cancer modulatory protein is a fragment of
approximately 14 to 24 amino acids long. More preferably the fragment is a soluble
fragment. Preferably, the fragment includes a non-transmembrane region. In a preferred
embodiment, the fragment has an N-terminal Cys to aid in solubility. In one embodiment, the
C-terminus of the fragment is kept as a free acid and the N-terminus is a free amine to aid in
coupling, i.e., to cysteine.

In one embodiment the prostate cancer proteins are conjugated to an 
immunogenic agent as discussed herein. In one embodiment the prostate cancer protein is 
conjugated to BSA. 

Measurements of prostate cancer polypeptide activity, or of prostate cancer or
the prostate cancer phenotype can be performed using a variety of assays. For example, the
effects of the test compounds upon the function of the prostate cancer polypeptides can be
measured by examining parameters described above. A suitable physiological change that
affects activity can be used to assess the influence of a test compound on the polypeptides of
this invention. When the functional consequences are determined using intact cells or
animals, one can also measure a variety of effects such as, in the case of prostate cancer
associated with tumors, tumor growth, tumor metastasis, neovascularization, hormone
release, transcriptional changes to both known and uncharacterized genetic markers (e.g.,
northern blots), changes in cell metabolism such as cell growth or pH changes, and changes
in intracellular second messengers such as cGMP. In the assays of the invention, mammalian
prostate cancer polypeptide is typically used, e.g., mouse, preferably human.

Assays to identify compounds with modulating activity can be performed in
vitro. For example, a prostate cancer polypeptide is first contacted with a potential modulator
and incubated for a suitable amount of time, e.g., from 0.5 to 48 hours. In one embodiment,
the prostate cancer polypeptide levels are determined in vitro by measuring the level of
protein or mRNA. The level of protein is measured using immunoassays such as western
blotting, ELISA and the like with an antibody that selectively binds to the prostate cancer
polypeptide or a fragment thereof. For measurement of mRNA, amplification, e.g., using
PCR, LCR, or hybridization assays, e.g., northern hybridization, RNAse protection, dot
blotting, are preferred. The level of protein or mRNA is detected using directly or indirectly
labeled detection agents, e.g., fluorescently or radioactively labeled nucleic acids,
radioactively or enzymatically labeled antibodies, and the like, as described herein.

Alternatively, a reporter gene system can be devised using the prostate cancer
protein promoter operably linked to a reporter gene such as luciferase, green fluorescent
protein, CAT, or β-gal. The reporter construct is typically transfected into a cell. After
treatment with a potential modulator, the amount of reporter gene transcription, translation, or
activity is measured according to standard techniques known to those of skill in the art.

In a preferred embodiment, as outlined above, screens may be done on
individual genes and gene products (proteins). That is, having identified a particular
differentially expressed gene as important in a particular state, screening of modulators of the
expression of the gene or the gene product itself can be done. The gene products of
differentially expressed genes are sometimes referred to herein as "prostate cancer proteins."
The prostate cancer protein may be a fragment, or alternatively, the full length protein
corresponding to a fragment shown herein.

In one embodiment, screening for modulators of expression of specific genes
is performed. Typically, the expression of only one or a few genes is evaluated. In another
embodiment, screens are designed to first find compounds that bind to differentially
expressed proteins. These compounds are then evaluated for the ability to modulate
the activity of the differentially expressed proteins. Moreover, once initial candidate
compounds are identified, variants can be further screened to better evaluate structure
activity relationships.

In a preferred embodiment, binding assays are done. In general, purified or
isolated gene product is used; that is, the gene products of one or more differentially
expressed nucleic acids are made. For example, antibodies are generated to the protein gene
products, and standard immunoassays are run to determine the amount of protein present.
Alternatively, cells comprising the prostate cancer proteins can be used in the assays.

Thus, in a preferred embodiment, the methods comprise combining a prostate
cancer protein and a candidate compound, and determining the binding of the compound to
the prostate cancer protein. Preferred embodiments utilize the human prostate cancer protein,
although other mammalian proteins may also be used, e.g., for the development of animal
models of human disease. In some embodiments, as outlined herein, variant or derivative
prostate cancer proteins may be used.

Generally, in a preferred embodiment of the methods herein, the prostate
cancer protein or the candidate agent is non-diffusably bound to an insoluble support having
isolated sample receiving areas (e.g., a microtiter plate, an array, etc.). The insoluble
supports may be made of any composition to which the compositions can be bound, that is readily
separated from soluble material, and that is otherwise compatible with the overall method of
screening. The surface of such supports may be solid or porous and of any convenient shape.
Examples of suitable insoluble supports include microtiter plates, arrays, membranes and
beads. These are typically made of glass, plastic (e.g., polystyrene), polysaccharides, nylon
or nitrocellulose, teflon™, etc. Microtiter plates and arrays are especially convenient because
a large number of assays can be carried out simultaneously, using small amounts of reagents
and samples. The particular manner of binding of the composition is not crucial so long as it
is compatible with the reagents and overall methods of the invention, maintains the activity of
the composition and is nondiffusable. Preferred methods of binding include the use of
antibodies (which do not sterically block either the ligand binding site or activation sequence
when the protein is bound to the support), direct binding to "sticky" or ionic supports,
chemical crosslinking, the synthesis of the protein or agent on the surface, etc. Following
binding of the protein or agent, excess unbound material is removed by washing. The sample
receiving areas may then be blocked through incubation with bovine serum albumin (BSA),
casein or other innocuous protein or other moiety.

In a preferred embodiment, the prostate cancer protein is bound to the support,
and a test compound is added to the assay. Alternatively, the candidate agent is bound to the
support and the prostate cancer protein is added. Novel binding agents include specific
antibodies, non-natural binding agents identified in screens of chemical libraries, peptide
analogs, etc. Of particular interest are screening assays for agents that have a low toxicity for
human cells. A wide variety of assays may be used for this purpose, including labeled in
vitro protein-protein binding assays, electrophoretic mobility shift assays, immunoassays for
protein binding, functional assays (phosphorylation assays, etc.) and the like.

The determination of the binding of the test modulating compound to the
prostate cancer protein may be done in a number of ways. In a preferred embodiment, the
compound is labeled, and binding determined directly, e.g., by attaching all or a portion of
the prostate cancer protein to a solid support, adding a labeled candidate agent (e.g., a
fluorescent label), washing off excess reagent, and determining whether the label is present
on the solid support. Various blocking and washing steps may be utilized as appropriate.

In some embodiments, only one of the components is labeled, e.g., the
proteins (or proteinaceous candidate compounds) can be labeled. Alternatively, more than
one component can be labeled with different labels, e.g., ¹²⁵I for the proteins and a fluorophore
for the compound. Proximity reagents, e.g., quenching or energy transfer reagents, are also
useful.

In one embodiment, the binding of the test compound is determined by
competitive binding assay. The competitor is a binding moiety known to bind to the target
molecule (i.e., a prostate cancer protein), such as an antibody, peptide, binding partner,
ligand, etc. Under certain circumstances, there may be competitive binding between the
compound and the binding moiety, with the binding moiety displacing the compound. In one
embodiment, the test compound is labeled. Either the compound, or the competitor, or both,
is added first to the protein for a time sufficient to allow binding, if present. Incubations may
be performed at a temperature which facilitates optimal activity, typically between 4 and
40°C. Incubation periods are typically optimized, e.g., to facilitate rapid high throughput
screening. Typically between 0.1 and 1 hour will be sufficient. Excess reagent is generally
removed or washed away. The second component is then added, and the presence or absence
of the labeled component is followed, to indicate binding.

In a preferred embodiment, the competitor is added first, followed by the test
compound. Displacement of the competitor is an indication that the test compound is binding
to the prostate cancer protein and thus is capable of binding to, and potentially modulating,
the activity of the prostate cancer protein. In this embodiment, either component can be
labeled. Thus, e.g., if the competitor is labeled, the presence of label in the wash solution
indicates displacement by the agent. Alternatively, if the test compound is labeled, the
presence of the label on the support indicates displacement.

In an alternative embodiment, the test compound is added first, with
incubation and washing, followed by the competitor. The absence of binding by the
competitor may indicate that the test compound is bound to the prostate cancer protein with a
higher affinity. Thus, if the test compound is labeled, the presence of the label on the
support, coupled with a lack of competitor binding, may indicate that the test compound is
capable of binding to the prostate cancer protein.

In a preferred embodiment, the methods comprise differential screening to
identify agents that are capable of modulating the activity of the prostate cancer proteins. In
this embodiment, the methods comprise combining a prostate cancer protein and a competitor
in a first sample. A second sample comprises a test compound, a prostate cancer protein, and
a competitor. The binding of the competitor is determined for both samples, and a change, or
difference in binding between the two samples indicates the presence of an agent capable of
binding to the prostate cancer protein and potentially modulating its activity. That is, if the
binding of the competitor is different in the second sample relative to the first sample, the
agent is capable of binding to the prostate cancer protein.
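
The two-sample comparison described above can be sketched as a small calculation. This is only an illustration under stated assumptions: the function names, the use of replicate mean signals, and the 20% cut-off (`threshold=0.2`) are hypothetical and are not taken from this document, which does not say how large a difference must be.

```python
from statistics import mean


def competitor_binding_change(first_sample, second_sample):
    """Fractional change in mean bound-competitor signal.

    first_sample:  replicate signals for protein + competitor.
    second_sample: replicate signals for protein + competitor + test compound.
    """
    baseline = mean(first_sample)
    if baseline <= 0:
        raise ValueError("baseline competitor signal must be positive")
    return (mean(second_sample) - baseline) / baseline


def is_candidate_binder(first_sample, second_sample, threshold=0.2):
    """Flag the test compound when competitor binding shifts by at least
    `threshold` (a hypothetical cut-off chosen for illustration)."""
    change = competitor_binding_change(first_sample, second_sample)
    return abs(change) >= threshold


change = competitor_binding_change([1000, 1000, 1000], [500, 500, 500])
print(change, is_candidate_binder([1000, 1000, 1000], [500, 500, 500]))
```

In practice the cut-off would be set from the variability of the controls rather than fixed in advance.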

Alternatively, differential screening is used to identify drug candidates that
bind to the native prostate cancer protein, but cannot bind to modified prostate cancer
proteins. The structure of the prostate cancer protein may be modeled, and used in rational
drug design to synthesize agents that interact with that site. Drug candidates that affect the
activity of a prostate cancer protein are also identified by screening drugs for the ability to
either enhance or reduce the activity of the protein.

Positive controls and negative controls may be used in the assays. Preferably
control and test samples are run in at least triplicate to obtain statistically significant
results. Incubation of all samples is for a time sufficient for the binding of the agent to the
protein. Following incubation, samples are washed free of non-specifically bound material
and the amount of bound, generally labeled, agent is determined. For example, where a
radiolabel is employed, the samples may be counted in a scintillation counter to determine the
amount of bound compound.
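
As a hedged illustration of how triplicate scintillation counts might be reduced to a single value, the sketch below subtracts the mean negative-control signal from the mean test signal. The function name `specific_binding`, the counts-per-minute inputs, and the reporting of the sample standard deviation are assumptions made for illustration, not a procedure given in this document.

```python
from statistics import mean, stdev


def specific_binding(test_cpm, negative_control_cpm):
    """Return (background-corrected mean signal, standard deviation of the
    test replicates). At least three replicates are required per group,
    following the triplicate preference stated in the text."""
    if len(test_cpm) < 3 or len(negative_control_cpm) < 3:
        raise ValueError("run test and control samples in at least triplicate")
    corrected = mean(test_cpm) - mean(negative_control_cpm)
    return corrected, stdev(test_cpm)


# Hypothetical counts per minute from one plate.
corrected, spread = specific_binding([1200, 1300, 1250], [200, 250, 225])
print(corrected, spread)
```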

A variety of other reagents may be included in the screening assays. These
include reagents like salts, neutral proteins, e.g., albumin, detergents, etc. which may be used
to facilitate optimal protein-protein binding and/or reduce non-specific or background
interactions. Also reagents that otherwise improve the efficiency of the assay, such as
protease inhibitors, nuclease inhibitors, anti-microbial agents, etc., may be used. The mixture
of components may be added in an order that provides for the requisite binding.

In a preferred embodiment, the invention provides methods for screening for a
compound capable of modulating the activity of a prostate cancer protein. The methods
comprise adding a test compound, as defined above, to a cell comprising prostate cancer
proteins. Preferred cell types include almost any cell. The cells contain a recombinant
nucleic acid that encodes a prostate cancer protein. In a preferred embodiment, a library of
candidate agents is tested on a plurality of cells.

In one aspect, the assays are evaluated in the presence or absence of, or previous
or subsequent exposure to, physiological signals, e.g., hormones, antibodies, peptides,
antigens, cytokines, growth factors, action potentials, pharmacological agents including
chemotherapeutics, radiation, carcinogenics, or other cells (i.e., cell-cell contacts). In another
example, the determinations are made at different stages of the cell cycle process.

In this way, compounds that modulate prostate cancer agents are identified.
Compounds with pharmacological activity are able to enhance or interfere with the activity of
the prostate cancer protein. Once identified, similar structures are evaluated to identify
critical structural features of the compound.

In one embodiment, a method of inhibiting prostate cancer cell division is
provided. The method comprises administration of a prostate cancer inhibitor. In another
embodiment, a method of inhibiting prostate cancer is provided. The method comprises
administration of a prostate cancer inhibitor. In a further embodiment, methods of treating
cells or individuals with prostate cancer are provided. The method comprises administration
of a prostate cancer inhibitor.

In one embodiment, a prostate cancer inhibitor is an antibody as discussed
above. In another embodiment, the prostate cancer inhibitor is an antisense molecule.

A variety of cell growth, proliferation, and metastasis assays are known to 
those of skill in the art, as described below. 

Soft agar growth or colony formation in suspension 

Normal cells require a solid substrate to attach and grow. When the cells are
transformed, they lose this phenotype and grow detached from the substrate. For example,
transformed cells can grow in stirred suspension culture or suspended in semi-solid media,
such as semi-solid or soft agar. The transformed cells, when transfected with tumor
suppressor genes, regenerate normal phenotype and require a solid substrate to attach and
grow. Soft agar growth or colony formation in suspension assays can be used to identify
modulators of prostate cancer sequences, which when expressed in host cells, inhibit
abnormal cellular proliferation and transformation. A therapeutic compound would reduce or
eliminate the host cells' ability to grow in stirred suspension culture or suspended in
semi-solid media, such as semi-solid or soft agar.

Techniques for soft agar growth or colony formation in suspension assays are
described in Freshney, Culture of Animal Cells: A Manual of Basic Technique (3rd ed., 1994),
herein incorporated by reference. See also, the methods section of Garkavtsev et al. (1996),
supra, herein incorporated by reference.

Contact inhibition and density limitation of growth 

Normal cells typically grow in a flat and organized pattern in a petri dish until
they touch other cells. When the cells touch one another, they are contact inhibited and stop
growing. When cells are transformed, however, the cells are not contact inhibited and
continue to grow to high densities in disorganized foci. Thus, the transformed cells grow to a
higher saturation density than normal cells. This can be detected morphologically by the
formation of a disoriented monolayer of cells or rounded cells in foci within the regular
pattern of normal surrounding cells. Alternatively, labeling index with (³H)-thymidine at
saturation density can be used to measure density limitation of growth. See Freshney (1994),
supra. The transformed cells, when transfected with tumor suppressor genes, regenerate a
normal phenotype and become contact inhibited and would grow to a lower density.

In this assay, labeling index with (³H)-thymidine at saturation density is a
preferred method of measuring density limitation of growth. Transformed host cells are
transfected with a prostate cancer-associated sequence and are grown for 24 hours at
saturation density in non-limiting medium conditions. The percentage of cells labeling with
(³H)-thymidine is determined autoradiographically. See, Freshney (1994), supra.
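
The labeling index referred to above is the percentage of counted cells that carry label. A minimal sketch of the arithmetic follows, assuming the labeled and total cell counts have already been scored from the autoradiograph; the function name `labeling_index` is illustrative only.

```python
def labeling_index(labeled_cells, total_cells):
    """Percentage of counted cells that are labeled with (3H)-thymidine."""
    if total_cells <= 0:
        raise ValueError("total_cells must be positive")
    if not 0 <= labeled_cells <= total_cells:
        raise ValueError("labeled_cells must lie between 0 and total_cells")
    return 100.0 * labeled_cells / total_cells


# Hypothetical counts: 150 labeled nuclei among 500 scored cells.
index = labeling_index(150, 500)
print(index)
```

A lower index at saturation density in cells expressing a given sequence, relative to the transformed control, would be read as restored density limitation.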

Growth factor or serum dependence

Transformed cells have a lower serum dependence than their normal
counterparts (see, e.g., Temin, J. Natl. Cancer Inst. 37:167-175 (1966); Eagle et al., J. Exp.
Med. 131:836-879 (1970); Freshney, supra). This is in part due to release of various growth
factors by the transformed cells. Growth factor or serum dependence of transformed host
cells can be compared with that of control.

Tumor specific marker levels

Tumor cells release a greater amount of certain factors (hereinafter "tumor
specific markers") than their normal counterparts. For example, plasminogen activator (PA)
is released from human glioma at a higher level than from normal brain cells (see, e.g.,
Gullino, Angiogenesis, tumor vascularization, and potential interference with tumor growth,
in Biological Responses in Cancer, pp. 178-184 (Mihich (ed.) 1985)). Similarly, tumor
angiogenesis factor (TAF) is released at a higher level in tumor cells than their normal
counterparts. See, e.g., Folkman, Angiogenesis and Cancer, Sem. Cancer Biol. (1992).

Various techniques which measure the release of these factors are described in
Freshney (1994), supra. Also, see, Unkless et al., J. Biol. Chem. 249:4295-4305 (1974);
Strickland & Beers, J. Biol. Chem. 251:5694-5702 (1976); Whur et al., Br. J. Cancer
42:305-312 (1980); Gullino, Angiogenesis, tumor vascularization, and potential interference with
tumor growth, in Biological Responses in Cancer, pp. 178-184 (Mihich (ed.) 1985);
Freshney, Anticancer Res. 5:111-130 (1985).

Invasiveness into Matrigel 

The degree of invasiveness into Matrigel or some other extracellular matrix
constituent can be used as an assay to identify compounds that modulate prostate
cancer-associated sequences. Tumor cells exhibit a good correlation between malignancy and
invasiveness of cells into Matrigel or some other extracellular matrix constituent. In this
assay, tumorigenic cells are typically used as host cells. Expression of a tumor suppressor
gene in these host cells would decrease invasiveness of the host cells.

Techniques described in Freshney (1994), supra, can be used. Briefly, the
level of invasion of host cells can be measured by using filters coated with Matrigel or some
other extracellular matrix constituent. Penetration into the gel, or through to the distal side of
the filter, is rated as invasiveness, and rated histologically by number of cells and distance
moved, or by prelabeling the cells with ¹²⁵I and counting the radioactivity on the distal side of
the filter or bottom of the dish. See, e.g., Freshney (1984), supra.

Tumor growth in vivo 

Effects of prostate cancer-associated sequences on cell growth can be tested in
transgenic or immune-suppressed mice. Knock-out transgenic mice can be made, in which
the prostate cancer gene is disrupted or in which a prostate cancer gene is inserted.
Knock-out transgenic mice can be made by insertion of a marker gene or other heterologous gene
into the endogenous prostate cancer gene site in the mouse genome via homologous
recombination. Such mice can also be made by substituting the endogenous prostate cancer
gene with a mutated version of the prostate cancer gene, or by mutating the endogenous
prostate cancer gene, e.g., by exposure to carcinogens.

A DNA construct is introduced into the nuclei of embryonic stem cells. Cells
containing the newly engineered genetic lesion are injected into a host mouse embryo, which
is re-implanted into a recipient female. Some of these embryos develop into chimeric mice
that possess germ cells partially derived from the mutant cell line. Therefore, by breeding the
chimeric mice it is possible to obtain a new line of mice containing the introduced genetic
lesion (see, e.g., Capecchi et al., Science 244:1288 (1989)). Chimeric targeted mice can be
derived according to Hogan et al., Manipulating the Mouse Embryo: A Laboratory Manual,
Cold Spring Harbor Laboratory (1988) and Teratocarcinomas and Embryonic Stem Cells: A
Practical Approach, Robertson, ed., IRL Press, Washington, D.C., (1987).

Alternatively, various immune-suppressed or immune-deficient host animals
can be used. For example, a genetically athymic "nude" mouse (see, e.g., Giovanella et al., J.
Natl. Cancer Inst. 52:921 (1974)), a SCID mouse, a thymectomized mouse, or an irradiated
mouse (see, e.g., Bradley et al., Br. J. Cancer 38:263 (1978); Selby et al., Br. J. Cancer
41:52 (1980)) can be used as a host. Transplantable tumor cells (typically about 10⁶ cells)
injected into isogenic hosts will produce invasive tumors in a high proportion of cases, while
normal cells of similar origin will not. In hosts which developed invasive tumors, cells
expressing a prostate cancer-associated sequence are injected subcutaneously. After a
suitable length of time, preferably 4-8 weeks, tumor growth is measured (e.g., by volume or
by its two largest dimensions) and compared to the control. Tumors that have a statistically
significant reduction (using, e.g., Student's T test) are said to have inhibited growth.
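
The measurement and comparison step can be sketched as follows. Two things here are assumptions rather than statements of this document: the ellipsoid approximation V = L × W² / 2 for estimating volume from the two largest dimensions (a common caliper convention), and the use of a pooled-variance two-sample Student's t statistic compared against a tabulated critical value. The function names and the example measurements are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def tumor_volume(length_mm, width_mm):
    """Estimate volume (mm^3) from the two largest dimensions using the
    assumed ellipsoid approximation V = L * W^2 / 2."""
    return length_mm * width_mm ** 2 / 2.0


def pooled_t_statistic(control, treated):
    """Two-sample Student's t statistic (equal variances assumed) and its
    degrees of freedom."""
    n1, n2 = len(control), len(treated)
    if n1 < 2 or n2 < 2:
        raise ValueError("each group needs at least two animals")
    pooled_var = ((n1 - 1) * variance(control) + (n2 - 1) * variance(treated)) / (n1 + n2 - 2)
    if pooled_var == 0:
        raise ValueError("no variation within groups; t is undefined")
    t = (mean(control) - mean(treated)) / sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return t, n1 + n2 - 2


# Hypothetical volumes (arbitrary units) for three control and three test animals.
t, df = pooled_t_statistic([10, 12, 14], [4, 6, 8])
# 2.776 is the two-tailed 5% critical value of Student's t for 4 degrees of freedom.
print(round(t, 3), df, t > 2.776)
```

With larger groups the critical value would be looked up for the corresponding degrees of freedom.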

Methods of identifying variant prostate cancer-associated sequences

Without being bound by theory, expression of various prostate cancer
sequences is correlated with prostate cancer. Accordingly, disorders based on mutant or
variant prostate cancer genes may be determined. In one embodiment, the invention provides
methods for identifying cells containing variant prostate cancer genes, e.g., determining all or
part of the sequence of at least one endogenous prostate cancer gene in a cell. This may be
accomplished using any number of sequencing techniques. In a preferred embodiment, the
invention provides methods of identifying the prostate cancer genotype of an individual, e.g.,
determining all or part of the sequence of at least one prostate cancer gene of the individual.
This is generally done in at least one tissue of the individual, and may include the evaluation
of a number of tissues or different samples of the same tissue. The method may include
comparing the sequence of the sequenced prostate cancer gene to a known prostate cancer
gene, i.e., a wild-type gene.

The sequence of all or part of the prostate cancer gene can then be compared
to the sequence of a known prostate cancer gene to determine if any differences exist. This
can be done using any number of known homology programs, such as Bestfit, etc. In a
preferred embodiment, the presence of a difference in the sequence between the prostate
cancer gene of the patient and the known prostate cancer gene correlates with a disease state
or a propensity for a disease state, as outlined herein.
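
To illustrate the comparison step only, the sketch below lists the positions at which a sample sequence differs from a reference. It uses Python's standard `difflib.SequenceMatcher`, which is a general-purpose matcher and is not Bestfit or any other homology program named in this document; the function name and the short example sequences are hypothetical.

```python
from difflib import SequenceMatcher


def sequence_differences(reference, sample):
    """Return (operation, ref_start, ref_end, ref_text, sample_text) for every
    region where `sample` differs from `reference`."""
    matcher = SequenceMatcher(None, reference, sample, autojunk=False)
    return [
        (tag, i1, i2, reference[i1:i2], sample[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# Hypothetical wild-type and patient fragments differing at one base.
diffs = sequence_differences("ATGGCCTTA", "ATGGACTTA")
print(diffs)
```

For real genes a proper alignment tool with a scoring matrix and gap penalties would be the appropriate choice; this sketch only shows the form of the output, i.e., a list of differing regions.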

In a preferred embodiment, the prostate cancer genes are used as probes to
determine the number of copies of the prostate cancer gene in the genome.

In another preferred embodiment, the prostate cancer genes are used as probes
to determine the chromosomal localization of the prostate cancer genes. Information such as
chromosomal localization finds use in providing a diagnosis or prognosis, in particular when
chromosomal abnormalities such as translocations and the like are identified in the prostate
cancer gene locus.

Administration of pharmaceutical and vaccine compositions

In one embodiment, a therapeutically effective dose of a prostate cancer
protein or modulator thereof is administered to a patient. By "therapeutically effective dose"
herein is meant a dose that produces the effects for which it is administered. The exact dose will
depend on the purpose of the treatment, and will be ascertainable by one skilled in the art
using known techniques (e.g., Ansel et al., Pharmaceutical Dosage Forms and Drug
Delivery; Lieberman, Pharmaceutical Dosage Forms (vols. 1-3, 1992), Dekker, ISBN
0824770846, 082476918X, 0824712692, 0824716981; Lloyd, The Art, Science and
Technology of Pharmaceutical Compounding (1999); and Pickar, Dosage Calculations
(1999)). As is known in the art, adjustments for prostate cancer degradation, systemic versus
localized delivery, and rate of new protease synthesis, as well as the age, body weight,
general health, sex, diet, time of administration, drug interaction and the severity of the
condition may be necessary, and will be ascertainable with routine experimentation by those
skilled in the art. U.S. Patent Application No. 09/687,576, which further discloses the use of
compositions and methods of diagnosis and treatment in prostate cancer, is hereby expressly
incorporated by reference.

A "patient" for the purposes of the present invention includes both humans
and other animals, particularly mammals. Thus the methods are applicable to both human
therapy and veterinary applications. In the preferred embodiment the patient is a mammal,
preferably a primate, and in the most preferred embodiment the patient is human.

The administration of the prostate cancer proteins and modulators thereof of
the present invention can be done in a variety of ways as discussed above, including, but not
limited to, orally, subcutaneously, intravenously, intranasally, transdermally,
intraperitoneally, intramuscularly, intrapulmonary, vaginally, rectally, or intraocularly. In
some instances, e.g., in the treatment of wounds and inflammation, the prostate cancer
proteins and modulators may be directly applied as a solution or spray.

The pharmaceutical compositions of the present invention comprise a prostate
cancer protein in a form suitable for administration to a patient. In the preferred embodiment,
the pharmaceutical compositions are in a water soluble form, such as being present as
pharmaceutically acceptable salts, which is meant to include both acid and base addition
salts. "Pharmaceutically acceptable acid addition salt" refers to those salts that retain the
biological effectiveness of the free bases and that are not biologically or otherwise
undesirable, formed with inorganic acids such as hydrochloric acid, hydrobromic acid,
sulfuric acid, nitric acid, phosphoric acid and the like, and organic acids such as acetic acid,
propionic acid, glycolic acid, pyruvic acid, oxalic acid, maleic acid, malonic acid, succinic
acid, fumaric acid, tartaric acid, citric acid, benzoic acid, cinnamic acid, mandelic acid,
methanesulfonic acid, ethanesulfonic acid, p-toluenesulfonic acid, salicylic acid and the like.
"Pharmaceutically acceptable base addition salts" include those derived from inorganic bases
such as sodium, potassium, lithium, ammonium, calcium, magnesium, iron, zinc, copper,
manganese, aluminum salts and the like. Particularly preferred are the ammonium,
potassium, sodium, calcium, and magnesium salts. Salts derived from pharmaceutically
acceptable organic non-toxic bases include salts of primary, secondary, and tertiary amines,
substituted amines including naturally occurring substituted amines, cyclic amines and basic
ion exchange resins, such as isopropylamine, trimethylamine, diethylamine, triethylamine,
tripropylamine, and ethanolamine.

The pharmaceutical compositions may also include one or more of the
following: carrier proteins such as serum albumin; buffers; fillers such as microcrystalline
cellulose, lactose, corn and other starches; binding agents; sweeteners and other flavoring
agents; coloring agents; and polyethylene glycol.

The pharmaceutical compositions can be administered in a variety of unit
dosage forms depending upon the method of administration. For example, unit dosage forms
suitable for oral administration include, but are not limited to, powder, tablets, pills, capsules
and lozenges. It is recognized that prostate cancer protein modulators (e.g., antibodies,
antisense constructs, ribozymes, small organic molecules, etc.) when administered orally,
should be protected from digestion. This is typically accomplished either by complexing the
molecule(s) with a composition to render it resistant to acidic and enzymatic hydrolysis, or by
packaging the molecule(s) in an appropriately resistant carrier, such as a liposome or a
protection barrier. Means of protecting agents from digestion are well known in the art.

The compositions for administration will commonly comprise a prostate
cancer protein modulator dissolved in a pharmaceutically acceptable carrier, preferably an
aqueous carrier. A variety of aqueous carriers can be used, e.g., buffered saline and the like.
These solutions are sterile and generally free of undesirable matter. These compositions may
be sterilized by conventional, well known sterilization techniques. The compositions may
contain pharmaceutically acceptable auxiliary substances as required to approximate
physiological conditions such as pH adjusting and buffering agents, toxicity adjusting agents
and the like, e.g., sodium acetate, sodium chloride, potassium chloride, calcium chloride,
sodium lactate and the like. The concentration of active agent in these formulations can vary
widely, and will be selected primarily based on fluid volumes, viscosities, body weight and
the like in accordance with the particular mode of administration selected and the patient's
needs (e.g., Remington's Pharmaceutical Sciences (15th ed., 1980) and Goodman & Gilman,
The Pharmacological Basis of Therapeutics (Hardman et al., eds., 1996)).

WO 02/30268
PCT/US01/32045

Thus, a typical pharmaceutical composition for intravenous administration
would be about 0.1 to 10 mg per patient per day. Dosages from 0.1 up to about 100 mg per
patient per day may be used, particularly when the drug is administered to a secluded site and
not into the blood stream, such as into a body cavity or into a lumen of an organ.
Substantially higher dosages are possible in topical administration. Actual methods for
preparing parenterally administrable compositions will be known or apparent to those skilled
in the art, e.g., Remington's Pharmaceutical Sciences and Goodman and Gilman, The
Pharmacological Basis of Therapeutics, supra.

The compositions containing modulators of prostate cancer proteins can be
administered for therapeutic or prophylactic treatments. In therapeutic applications,
compositions are administered to a patient suffering from a disease (e.g., a cancer) in an
amount sufficient to cure or at least partially arrest the disease and its complications. An
amount adequate to accomplish this is defined as a "therapeutically effective dose." Amounts
effective for this use will depend upon the severity of the disease and the general state of the
patient's health. Single or multiple administrations of the compositions may be administered
depending on the dosage and frequency as required and tolerated by the patient. In any event,
the composition should provide a sufficient quantity of the agents of this invention to
effectively treat the patient. An amount of modulator that is capable of preventing or slowing
the development of cancer in a mammal is referred to as a "prophylactically effective dose."
The particular dose required for a prophylactic treatment will depend upon the medical
condition and history of the mammal, the particular cancer being prevented, as well as other
factors such as age, weight, gender, administration route, efficiency, etc. Such prophylactic
treatments may be used, e.g., in a mammal who has previously had cancer to prevent a
recurrence of the cancer, or in a mammal who is suspected of having a significant likelihood
of developing cancer.

It will be appreciated that the present prostate cancer protein-modulating
compounds can be administered alone or in combination with additional prostate cancer
modulating compounds or with other therapeutic agents, e.g., other anti-cancer agents or
treatments.

In numerous embodiments, one or more nucleic acids, e.g., polynucleotides
comprising nucleic acid sequences set forth in Tables 1-16, such as antisense polynucleotides
or ribozymes, will be introduced into cells, in vitro or in vivo. The present invention provides
methods, reagents, vectors, and cells useful for expression of prostate cancer-associated
polypeptides and nucleic acids using in vitro (cell-free), ex vivo or in vivo (cell or
organism-based) recombinant expression systems.

The particular procedure used to introduce the nucleic acids into a host cell for
expression of a protein or nucleic acid is application specific. Many procedures for
introducing foreign nucleotide sequences into host cells may be used. These include the use
of calcium phosphate transfection, spheroplasts, electroporation, liposomes, microinjection,
plasmid vectors, viral vectors and any of the other well known methods for introducing cloned
genomic DNA, cDNA, synthetic DNA or other foreign genetic material into a host cell (see,
e.g., Berger & Kimmel, Guide to Molecular Cloning Techniques, Methods in Enzymology
volume 152 (Berger), Ausubel et al., eds., Current Protocols (supplemented through 1999),
and Sambrook et al., Molecular Cloning - A Laboratory Manual (2nd ed., Vol. 1-3, 1989)).

In a preferred embodiment, prostate cancer proteins and modulators are
administered as therapeutic agents, and can be formulated as outlined above. Similarly,
prostate cancer genes (including the full-length sequence, partial sequences, or
regulatory sequences of the prostate cancer coding regions) can be administered in a gene
therapy application. These prostate cancer genes can include antisense applications, either as
gene therapy (i.e., for incorporation into the genome) or as antisense compositions, as will be
appreciated by those in the art.

Prostate cancer polypeptides and polynucleotides can also be administered as
vaccine compositions to stimulate HTL, CTL and antibody responses. Such vaccine
compositions can include, e.g., lipidated peptides (see, e.g., Vitiello, A. et al., J. Clin. Invest.
95:341 (1995)), peptide compositions encapsulated in poly(DL-lactide-co-glycolide) ("PLG")
microspheres (see, e.g., Eldridge et al., Molec. Immunol. 28:287-294 (1991); Alonso et al.,
Vaccine 12:299-306 (1994); Jones et al., Vaccine 13:675-681 (1995)), peptide compositions
contained in immune stimulating complexes (ISCOMS) (see, e.g., Takahashi et al., Nature
344:873-875 (1990); Hu et al., Clin. Exp. Immunol. 113:235-243 (1998)), multiple antigen
peptide systems (MAPs) (see, e.g., Tam, Proc. Natl. Acad. Sci. U.S.A. 85:5409-5413 (1988);
Tam, J. Immunol. Methods 196:17-32 (1996)), peptides formulated as multivalent peptides;
peptides for use in ballistic delivery systems, typically crystallized peptides, viral delivery
vectors (Perkus et al., In: Concepts in vaccine development (Kaufmann, ed., p. 379, 1996);
Chakrabarti et al., Nature 320:535 (1986); Hu et al., Nature 320:537 (1986); Kieny et al.,
AIDS Bio/Technology 4:790 (1986); Top et al., J. Infect. Dis. 124:148 (1971); Chanda et al.,
Virology 175:535 (1990)), particles of viral or synthetic origin (see, e.g., Kofler et al., J.
Immunol. Methods 192:25 (1996); Eldridge et al., Sem. Hematol. 30:16 (1993); Falo et al.,
Nature Med. 7:649 (1995)), adjuvants (Warren et al., Annu. Rev. Immunol. 4:369 (1986);
Gupta et al., Vaccine 11:293 (1993)), liposomes (Reddy et al., J. Immunol. 148:1585 (1992);
Rock, Immunol. Today 17:131 (1996)), or naked or particle absorbed cDNA (Ulmer et al.,
Science 259:1745 (1993); Robinson et al., Vaccine 11:957 (1993); Shiver et al., In: Concepts
in vaccine development (Kaufmann, ed., p. 423, 1996); Cease & Berzofsky, Annu. Rev.
Immunol. 12:923 (1994) and Eldridge et al., Sem. Hematol. 30:16 (1993)). Toxin-targeted
delivery technologies, also known as receptor mediated targeting, such as those of Avant
Immunotherapeutics, Inc. (Needham, Massachusetts) may also be used.

Vaccine compositions often include adjuvants. Many adjuvants contain a
substance designed to protect the antigen from rapid catabolism, such as aluminum hydroxide
or mineral oil, and a stimulator of immune responses, such as lipid A, Bordetella pertussis or
Mycobacterium tuberculosis derived proteins. Certain adjuvants are commercially available
as, e.g., Freund's Incomplete Adjuvant and Complete Adjuvant (Difco Laboratories, Detroit,
MI); Merck Adjuvant 65 (Merck and Company, Inc., Rahway, NJ); AS-2 (SmithKline
Beecham, Philadelphia, PA); aluminum salts such as aluminum hydroxide gel (alum) or
aluminum phosphate; salts of calcium, iron or zinc; an insoluble suspension of acylated
tyrosine; acylated sugars; cationically or anionically derivatized polysaccharides;
polyphosphazenes; biodegradable microspheres; monophosphoryl lipid A and Quil A.
Cytokines, such as GM-CSF, interleukin-2, -7, -12, and other like growth factors, may also be
used as adjuvants.

Vaccines can be administered as nucleic acid compositions wherein DNA or
RNA encoding one or more of the polypeptides, or a fragment thereof, is administered to a
patient. This approach is described, for instance, in Wolff et al., Science 247:1465 (1990) as
well as U.S. Patent Nos. 5,580,859; 5,589,466; 5,804,566; 5,739,118; 5,736,524; 5,679,647;
WO 98/04720; and in more detail below. Examples of DNA-based delivery technologies
include "naked DNA", facilitated (bupivacaine, polymers, peptide-mediated) delivery,
cationic lipid complexes, and particle-mediated ("gene gun") or pressure-mediated delivery
(see, e.g., U.S. Patent No. 5,922,687).

For therapeutic or prophylactic immunization purposes, the peptides of the
invention can be expressed by viral or bacterial vectors. Examples of expression vectors
include attenuated viral hosts, such as vaccinia or fowlpox. This approach involves the use of
vaccinia virus, e.g., as a vector to express nucleotide sequences that encode prostate cancer
polypeptides or polypeptide fragments. Upon introduction into a host, the recombinant
vaccinia virus expresses the immunogenic peptide, and thereby elicits an immune response.
Vaccinia vectors and methods useful in immunization protocols are described in, e.g., U.S.
Patent No. 4,722,848. Another vector is BCG (Bacille Calmette Guerin). BCG vectors are
described in Stover et al., Nature 351:456-460 (1991). A wide variety of other vectors useful
for therapeutic administration or immunization, e.g., adeno and adeno-associated virus vectors,
retroviral vectors, Salmonella typhi vectors, detoxified anthrax toxin vectors, and the like,
will be apparent to those skilled in the art from the description herein (see, e.g., Shata et al.,
Mol. Med. Today 6:66-71 (2000); Shedlock et al., J. Leukoc. Biol. 68:793-806 (2000); Hipp et
al., In Vivo 14:571-85 (2000)).

Methods for the use of genes as DNA vaccines are well known, and include
placing a prostate cancer gene or portion of a prostate cancer gene under the control of a
regulatable promoter or a tissue-specific promoter for expression in a prostate cancer patient.
The prostate cancer gene used for DNA vaccines can encode full-length prostate cancer
proteins, but more preferably encodes portions of the prostate cancer proteins including
peptides derived from the prostate cancer protein. In one embodiment, a patient is
immunized with a DNA vaccine comprising a plurality of nucleotide sequences derived from
a prostate cancer gene. For example, prostate cancer-associated genes or sequences encoding
subfragments of a prostate cancer protein are introduced into expression vectors and tested
for their immunogenicity in the context of Class I MHC and an ability to generate cytotoxic T
cell responses. This procedure provides for production of cytotoxic T cell responses against
cells which present antigen, including intracellular epitopes.

In a preferred embodiment, the DNA vaccines include a gene encoding an
adjuvant molecule with the DNA vaccine. Such adjuvant molecules include cytokines that
increase the immunogenic response to the prostate cancer polypeptide encoded by the DNA
vaccine. Additional or alternative adjuvants are available.

In another preferred embodiment, prostate cancer genes find use in generating
animal models of prostate cancer. When the prostate cancer gene identified is repressed or
diminished in cancer tissue, gene therapy technology, e.g., antisense RNA directed
to the prostate cancer gene, will also diminish or repress expression of the gene. Animal
models of prostate cancer find use in screening for modulators of a prostate cancer-associated
sequence or modulators of prostate cancer. Similarly, transgenic animal technology
including gene knockout technology, e.g., as a result of homologous recombination with an
appropriate gene targeting vector, will result in the absence or increased expression of the
prostate cancer protein. When desired, tissue-specific expression or knockout of the prostate
cancer protein may be necessary.

It is also possible that the prostate cancer protein is overexpressed in prostate
cancer. As such, transgenic animals can be generated that overexpress the prostate cancer
protein. Depending on the desired expression level, promoters of various strengths can be
employed to express the transgene. Also, the number of copies of the integrated transgene
can be determined and compared for a determination of the expression level of the transgene.
Animals generated by such methods find use as animal models of prostate cancer and are
additionally useful in screening for modulators to treat prostate cancer.

Kits for Use in Diagnostic and/or Prognostic Applications 

For use in diagnostic, research, and therapeutic applications suggested above,
kits are also provided by the invention. In the diagnostic and research applications such kits
may include any or all of the following: assay reagents, buffers, prostate cancer-specific
nucleic acids or antibodies, hybridization probes and/or primers, antisense polynucleotides,
ribozymes, dominant negative prostate cancer polypeptides or polynucleotides, small
molecule inhibitors of prostate cancer-associated sequences, etc. A therapeutic product may
include sterile saline or another pharmaceutically acceptable emulsion and suspension base.

In addition, the kits may include instructional materials containing directions
(i.e., protocols) for the practice of the methods of this invention. While the instructional
materials typically comprise written or printed materials, they are not limited to such. Any
medium capable of storing such instructions and communicating them to an end user is
contemplated by this invention. Such media include, but are not limited to, electronic storage
media (e.g., magnetic discs, tapes, cartridges, chips), optical media (e.g., CD ROM), and the
like. Such media may include addresses to internet sites that provide such instructional
materials.

The present invention also provides for kits for screening for modulators of 
prostate cancer-associated sequences. Such kits can be prepared from readily available 
materials and reagents. For example, such kits can comprise one or more of the following 
materials: a prostate cancer-associated polypeptide or polynucleotide, reaction tubes, and 
instructions for testing prostate cancer-associated activity. Optionally, the kit contains 
biologically active prostate cancer protein. A wide variety of kits and components can be 
prepared according to the present invention, depending upon the intended user of the kit and 
the particular needs of the user. Diagnosis would typically involve evaluation of a plurality 
of genes or products. The genes will be selected based on correlations with important 
parameters in disease which may be identified in historical or outcome data.

EXAMPLES

Example 1: Tissue Preparation, Labeling Chips, and Fingerprints 

Purifying total RNA from tissue sample using TRIzol Reagent

The sample weight is first estimated. The tissue samples are homogenized in
1 ml of TRIzol per 50 mg of tissue using a homogenizer (e.g., Polytron 3100). The size of
the generator/probe used depends upon the sample amount. A generator that is too large for
the amount of tissue to be homogenized will cause a loss of sample and lower RNA yield. A
larger generator (e.g., 20 mm) is suitable for tissue samples weighing more than 0.6 g.
Tubes should not be overfilled. If the working volume is greater than 2 ml and no greater than
10 ml, a 15 ml polypropylene tube (Falcon 2059) is suitable for homogenization.

Tissues should be kept frozen until homogenized. The TRIzol is added
directly to the frozen tissue before homogenization. Following homogenization, the insoluble
material is removed from the homogenate by centrifugation at 7500 x g for 15 min. in a
Sorvall superspeed or 12,000 x g for 10 min. in an Eppendorf centrifuge at 4°C. The cleared
homogenate is then transferred to a new tube(s). Samples may be frozen and stored at -60 to
-70°C for at least one month; otherwise, the purification is continued.

The next process is phase separation. The homogenized samples are incubated
for 5 minutes at room temperature. Then, 0.2 ml of chloroform per 1 ml of TRIzol reagent is
added to the homogenization mixture. The tubes are securely capped and shaken vigorously
by hand (do not vortex) for 15 seconds. The samples are then incubated at room temperature for
2-3 minutes and next centrifuged at 6500 rpm in a Sorvall superspeed for 30 min. at 4°C.
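
The protocol gives some spins in rpm and others in x g. A minimal sketch of the usual conversion, RCF = 1.118e-5 x r x rpm^2 with r in cm, is shown below. The rotor radius is not stated in the text; the 10.5 cm used in the usage note is a hypothetical value chosen only because it roughly reproduces the "<8000 rpm (<7500 x g)" pair quoted for the ethanol wash spin.

```python
import math

RCF_CONSTANT = 1.118e-5  # standard factor for radius in cm and speed in rpm


def rcf_from_rpm(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force (x g) for a given speed and rotor radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def rpm_from_rcf(rcf: float, radius_cm: float) -> float:
    """Speed (rpm) needed to reach a given relative centrifugal force."""
    return math.sqrt(rcf / (RCF_CONSTANT * radius_cm))


# Hypothetical rotor radius (an assumption, not taken from the protocol).
ASSUMED_RADIUS_CM = 10.5
```

With the assumed radius, `rcf_from_rpm(8000, ASSUMED_RADIUS_CM)` comes out close to 7500 x g; the actual rotor radius should be substituted before relying on any converted value.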

The next process is RNA precipitation. The aqueous phase is transferred to a
fresh tube. The organic phase can be saved if isolation of DNA or protein is desired. Then
0.5 ml of isopropyl alcohol is added per 1 ml of TRIzol reagent used in the original
homogenization. Then, the tubes are securely capped and inverted to mix. The samples are
then incubated at room temperature for 10 minutes and centrifuged at 6500 rpm in a Sorvall for 20
min. at 4°C.

The RNA is then washed. The supernatant is poured off and the pellet washed
with cold 75% ethanol. 1 ml of 75% ethanol is used per 1 ml of the TRIzol reagent used in
the initial homogenization. The tubes are capped securely and inverted several times to
loosen the pellet without vortexing. They are next centrifuged at <8000 rpm (<7500 x g) for 5
minutes at 4°C.
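
The reagent volumes in the steps above all scale with the amount of TRIzol, which in turn scales with tissue weight. The following is a small sketch of that arithmetic using only the ratios stated in the text (1 ml TRIzol per 50 mg tissue; 0.2 ml chloroform, 0.5 ml isopropyl alcohol and 1 ml of 75% ethanol per 1 ml TRIzol). The function name and dictionary keys are illustrative.

```python
def trizol_reagent_volumes(tissue_mg: float) -> dict:
    """Scale reagent volumes (in ml) from the tissue weight (in mg)."""
    trizol_ml = tissue_mg / 50.0  # 1 ml TRIzol per 50 mg of tissue
    return {
        "trizol_ml": trizol_ml,
        "chloroform_ml": 0.2 * trizol_ml,    # phase separation
        "isopropanol_ml": 0.5 * trizol_ml,   # RNA precipitation
        "ethanol_75pct_ml": 1.0 * trizol_ml  # RNA wash
    }


# Example: a 0.6 g sample, the weight above which a larger generator is suggested.
volumes = trizol_reagent_volumes(600)
```

For the 600 mg example this gives 12 ml TRIzol, 2.4 ml chloroform, 6 ml isopropyl alcohol and 12 ml of 75% ethanol.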

The RNA wash is decanted. The pellet is carefully transferred to an
Eppendorf tube (sliding down the tube into the new tube by use of a pipet tip to help guide it
in if necessary). Tube sizes for precipitating the RNA depend on the working volumes.
Larger tubes may take too long to dry. The pellet is dried. The RNA is then resuspended in
DEPC H2O at an appropriate concentration (e.g., 2-5 ug/ul). The absorbance is then measured.

The poly A+ mRNA may next be purified from total RNA by other methods
such as Qiagen's RNeasy kit. The poly A+ mRNA is purified from total RNA by adding the
Oligotex suspension, which has been heated to 37°C and mixed prior to adding to the RNA.
The Elution Buffer is incubated at 70°C. If there is precipitate in the buffer, the 2 x
Binding Buffer is warmed at 65°C. The total RNA is mixed with DEPC-treated water, 2 x Binding
Buffer, and Oligotex according to Table 2 on page 16 of the Oligotex Handbook and next
incubated for 3 minutes at 65°C and 10 minutes at room temperature.

The preparation is centrifuged for 2 minutes at 14,000 to 18,000 g, preferably
at a "soft setting." The supernatant is removed without disturbing the Oligotex pellet. A little bit
of solution can be left behind to reduce the loss of Oligotex. The supernatant is saved until
satisfactory binding and elution of poly A+ mRNA has been found.

Then, the preparation is gently resuspended in Wash Buffer OW2 and pipetted
onto the spin column and centrifuged at full speed (soft setting if possible) for 1 minute.

Next, the spin column is transferred to a new collection tube and gently
resuspended in Wash Buffer OW2 and centrifuged as described herein.

Then, the spin column is transferred to a new tube and eluted with 20 to 100 ul
of preheated (70°C) Elution Buffer. The Oligotex resin is gently resuspended by pipetting up
and down. The centrifugation is repeated as above and the elution repeated with fresh elution
buffer or first eluate to keep the elution volume low.

The absorbance is next read to determine the yield, using diluted Elution
Buffer as the blank.
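
The text does not say how the absorbance reading is turned into a yield. A minimal sketch under a common assumption is given below: an A260 of 1.0 in a 1 cm path corresponds to roughly 40 ug/ml of single-stranded RNA. That factor, the function names and the example numbers are assumptions for illustration and are not taken from the protocol.

```python
# Common approximation (an assumption here): A260 of 1.0 ~ 40 ug/ml RNA, 1 cm path.
RNA_UG_PER_ML_PER_A260 = 40.0


def rna_concentration_ug_per_ul(a260: float, dilution_factor: float) -> float:
    """Concentration of the undiluted sample in ug/ul."""
    return a260 * RNA_UG_PER_ML_PER_A260 * dilution_factor / 1000.0


def rna_yield_ug(a260: float, dilution_factor: float, sample_volume_ul: float) -> float:
    """Total RNA in the sample in ug."""
    return rna_concentration_ug_per_ul(a260, dilution_factor) * sample_volume_ul


# Hypothetical reading: A260 = 0.25 on a 1:50 dilution of a 100 ul eluate.
example_yield = rna_yield_ug(0.25, 50, 100)
```

In this hypothetical case the eluate is 0.5 ug/ul, i.e. 50 ug in total, which would need concentrating before use at the 1 ug/ul called for in cDNA synthesis.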

Before proceeding with cDNA synthesis, the mRNA is precipitated,
as components leftover or in the Elution Buffer from the
Oligotex purification procedure will inhibit downstream enzymatic reactions of the mRNA.
0.4 vol. of 7.5 M NH4OAc + 2.5 vol. of cold 100% ethanol is added and the preparation
precipitated at -20°C for 1 hour to overnight (or 20-30 min. at -70°C), and centrifuged at
14,000-16,000 x g for 30 minutes at 4°C. Next, the pellet is washed with 0.5 ml of 80%
ethanol (-20°C) and then centrifuged at 14,000-16,000 x g for 5 minutes at room temperature.
The 80% ethanol wash is then repeated. The last bit of ethanol from the pellet is then dried
without use of a speed vacuum and the pellet is then resuspended in DEPC H2O at 1 ug/ul
concentration.

Alternatively, the RNA may be purified using other methods (e.g., Qiagen's RNeasy kit).

No more than 100 ug is added to the RNeasy column. The sample volume is
adjusted to 100 ul with RNase-free water. 350 ul Buffer RLT and then 250 ul ethanol
(100%) are added to the sample. The preparation is then mixed by pipetting and applied to an
RNeasy mini spin column for centrifugation (15 sec at >10,000 rpm). If yield is low, reapply
the flowthrough to the column and centrifuge again.

Then, transfer the column to a new 2 ml collection tube, add 500 ul Buffer
RPE and centrifuge for 15 sec at >10,000 rpm. The flowthrough is discarded. 500 ul Buffer
RPE is then added and the preparation is centrifuged for 15 sec at >10,000 rpm. The
flowthrough is discarded, and the column membrane dried by centrifuging for 2 min at
maximum speed. The column is transferred to a new 1.5-ml collection tube. 30-50 ul of
RNase-free water is applied directly onto the column membrane. The column is then centrifuged
for 1 min at >10,000 rpm and the elution step repeated.

The absorbance is then read to determine yield. If necessary, the material may 
be ethanol precipitated with ammonium acetate and 2.5X volume 100% ethanol. 

First Strand cDNA Synthesis

The first strand can be made using Gibco's "Superscript Choice System
for cDNA Synthesis" kit. The starting material is 5 ug of total RNA or 1 ug of polyA+
mRNA. For total RNA, 2 ul of Superscript RT is used; for polyA+ mRNA, 1 ul of
Superscript RT is used. The final volume of first strand synthesis mix is 20 ul. The RNA
should be in a volume no greater than 10 ul. The RNA is incubated with 1 ul of 100 pmol
T7-T24 oligo for 10 min at 70°C followed by addition on ice of 7 ul of: 4 ul 5X 1st Strand
Buffer, 2 ul of 0.1 M DTT, and 1 ul of 10 mM dNTP mix. The preparation is then incubated at
37°C for 2 min before addition of the Superscript RT followed by incubation at 37°C for 1
hour.

Second Strand Synthesis

For the second strand synthesis, place 1st strand reactions on ice and add: 91
ul DEPC H2O; 30 ul 5X 2nd Strand Buffer; 3 ul 10 mM dNTP mix; 1 ul 10 U/ul E. coli DNA
ligase; 4 ul 10 U/ul E. coli DNA Polymerase; and 1 ul 2 U/ul RNase H. Mix and incubate 2
hours at 16°C. Add 2 ul T4 DNA Polymerase. Incubate 5 min at 16°C. Add 10 ul of 0.5 M
EDTA.
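
As a consistency check on the second strand recipe, the listed additions can be summed onto the 20 ul first strand reaction. The sketch below uses only the volumes given in the text; it shows that the 30 ul of 5X buffer ends up at 1X in the resulting 150 ul reaction. The variable and function names are illustrative.

```python
FIRST_STRAND_UL = 20.0  # final volume of the first strand synthesis mix

# Additions for second strand synthesis, in ul, as listed in the protocol.
SECOND_STRAND_ADDITIONS_UL = {
    "DEPC H2O": 91.0,
    "5X 2nd Strand Buffer": 30.0,
    "10 mM dNTP mix": 3.0,
    "E. coli DNA ligase (10 U/ul)": 1.0,
    "E. coli DNA polymerase (10 U/ul)": 4.0,
    "RNase H (2 U/ul)": 1.0,
}


def second_strand_volume_ul() -> float:
    """Reaction volume during the 2 hour incubation at 16°C."""
    return FIRST_STRAND_UL + sum(SECOND_STRAND_ADDITIONS_UL.values())


def buffer_strength_x() -> float:
    """Final strength of the 5X 2nd Strand Buffer in the reaction."""
    return 5.0 * SECOND_STRAND_ADDITIONS_UL["5X 2nd Strand Buffer"] / second_strand_volume_ul()


def stopped_reaction_volume_ul() -> float:
    """Volume after adding 2 ul T4 DNA polymerase and 10 ul 0.5 M EDTA."""
    return second_strand_volume_ul() + 2.0 + 10.0
```

The reaction volume is 150 ul at 1X buffer, and 162 ul after the T4 DNA polymerase and EDTA additions; the latter is the volume that goes into the phenol:chloroform clean-up.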



Cleaning up cDNA

The cDNA is purified using Phenol:Chloroform:Isoamyl Alcohol (25:24:1)
and Phase-Lock gel tubes. The PLG tubes are centrifuged for 30 sec at maximum speed.
The cDNA mix is then transferred to a PLG tube. An equal volume of
phenol:chloroform:isoamyl alcohol is then added, the preparation shaken vigorously (no
vortexing), and centrifuged for 5 minutes at maximum speed. The top aqueous solution is
transferred to a new tube and ethanol precipitated by adding 7.5X 5M NH4OAc and 2.5X
volume of 100% ethanol. Next, it is centrifuged immediately at room temperature for 20
min at maximum speed. The supernatant is removed, and the pellet washed 2X with cold
80% ethanol. As much ethanol wash as possible should be removed before air drying the
pellet and resuspending it in 3 ul RNase-free water.

In vitro Transcription (IVT) and labeling with biotin

In vitro transcription (IVT) and labeling with biotin is performed as follows:
Pipet 1.5 ul of cDNA into a thin-wall PCR tube. Make NTP labeling mix by combining 2 ul
T7 10x ATP (75 mM) (Ambion); 2 ul T7 10x GTP (75 mM) (Ambion); 1.5 ul T7 10x CTP (75
mM) (Ambion); 1.5 ul T7 10x UTP (75 mM) (Ambion); 3.75 ul 10 mM Bio-11-UTP
(Boehringer-Mannheim/Roche or Enzo); 3.75 ul 10 mM Bio-16-CTP (Enzo); 2 ul 10x T7
transcription buffer (Ambion); and 2 ul 10x T7 enzyme mix (Ambion). The final volume is
20 ul. Incubate 6 hours at 37°C in a PCR machine. The RNA can be further cleaned.
Clean-up follows the previous instructions for RNeasy columns or Qiagen's RNeasy protocol
handbook. The cRNA often needs to be ethanol precipitated and resuspended in a volume
compatible with the fragmentation step.
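
The labeling mix volumes can be checked against the stated 20 ul final volume, and the share of biotinylated nucleotide follows from the same numbers. The sketch below uses only the volumes and stock concentrations given in the text; the names are illustrative.

```python
CDNA_UL = 1.5

# (volume in ul, stock concentration in mM or None for buffer/enzyme)
IVT_MIX = {
    "ATP": (2.0, 75.0),
    "GTP": (2.0, 75.0),
    "CTP": (1.5, 75.0),
    "UTP": (1.5, 75.0),
    "Bio-11-UTP": (3.75, 10.0),
    "Bio-16-CTP": (3.75, 10.0),
    "10x T7 transcription buffer": (2.0, None),
    "10x T7 enzyme mix": (2.0, None),
}


def ivt_total_volume_ul() -> float:
    """cDNA plus all labeling mix components."""
    return CDNA_UL + sum(volume for volume, _ in IVT_MIX.values())


def nmol(component: str) -> float:
    """Amount of a nucleotide in nmol (ul x mM = nmol)."""
    volume, conc = IVT_MIX[component]
    return volume * conc


def biotin_fraction(labeled: str, unlabeled: str) -> float:
    """Fraction of a given nucleotide pool that carries biotin."""
    return nmol(labeled) / (nmol(labeled) + nmol(unlabeled))
```

The components sum to 20 ul as stated, and one quarter of both the UTP and the CTP pool is biotinylated (37.5 nmol labeled out of 150 nmol total in each case).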

Fragmentation is performed as follows. 15 ug of labeled RNA is usually
fragmented. Try to minimize the fragmentation reaction volume; a 10 ul volume is
recommended but 20 ul is all right. Do not go higher than 20 ul because the magnesium in
the fragmentation buffer contributes to precipitation in the hybridization buffer. Fragment
RNA by incubation at 94°C for 35 minutes in 1 x Fragmentation buffer (5 x Fragmentation
buffer is 200 mM Tris-acetate, pH 8.1; 500 mM KOAc; 150 mM MgOAc). The labeled
RNA transcript can be analyzed before and after fragmentation. Samples can be heated to
65°C for 15 minutes and electrophoresed on 1% agarose/TBE gels to get an approximate idea
of the transcript size range.

For hybridization, 200 ul (10 ug cRNA) of a hybridization mix is put on the
chip. If multiple hybridizations are to be done (such as cycling through a 5 chip set), then it
is recommended that an initial hybridization mix of 300 ul or more be made. The
hybridization mix is: fragmented labeled RNA (50 ng/ul final conc.); 50 pM 948-b control
oligo; 1.5 pM BioB; 5 pM BioC; 25 pM BioD; 100 pM CRE; 0.1 mg/ml herring sperm DNA;
0.5 mg/ml acetylated BSA; and 1x MES hyb buffer to 300 ul.
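
The cRNA amounts quoted for fragmentation and hybridization are consistent with the 50 ng/ul final concentration, as the sketch below illustrates. It also shows the dilution of the 5 x Fragmentation buffer to 1 x. The function names are illustrative, and only figures stated in the text are used.

```python
def crna_mass_ug(mix_volume_ul: float, final_ng_per_ul: float = 50.0) -> float:
    """Fragmented cRNA contained in a given volume of hybridization mix."""
    return mix_volume_ul * final_ng_per_ul / 1000.0


def stock_volume_ul(final_volume_ul: float, stock_x: float, final_x: float = 1.0) -> float:
    """Volume of a concentrated stock needed to reach final_x in final_volume_ul."""
    return final_volume_ul * final_x / stock_x


on_chip_ug = crna_mass_ug(200)              # the 200 ul applied to one chip
full_mix_ug = crna_mass_ug(300)             # a 300 ul initial mix
frag_buffer_ul = stock_volume_ul(20, 5.0)   # 5 x buffer in a 20 ul fragmentation
```

A 200 ul aliquot carries 10 ug cRNA, a 300 ul mix needs 15 ug (the amount usually fragmented), and a 20 ul fragmentation reaction takes 4 ul of 5 x buffer.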

The hybridization reaction is conducted with non-biotinylated IVT (purified 
by RNeasy columns) (see example 1 for steps from tissue to IVT): The following mixture is 
prepared: 






IVT antisense RNA; 4 ug:          ul
Random Hexamers (1 ug/ul):      4 ul
H2O:                              ul
Total:                         14 ul

Incubate the above 14 ul mixture at 70°C for 10 min.; then put on ice.

The reverse transcription procedure uses the following mixture:
0.1 M DTT:                      3 ul
50X dNTP mix:                 0.6 ul
H2O:                          2.4 ul
Cy3 or Cy5 dUTP (1 mM):         3 ul
SS RT II (BRL):                 1 ul
Total:                         16 ul

The above solution is added to the hybridization reaction and incubated for 30 min. at 42°C.
Then, 1 ul SSII is added and incubated for another hour before being placed on ice.

The 50X dNTP mix contains 25 mM of cold dATP, dCTP, and dGTP, 10 mM
of dTTP and is made by adding 25 ul each of 100 mM dATP, dCTP, and dGTP; 10 ul of
100 mM dTTP to 15 ul H2O.
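
The stated composition of the 50X dNTP mix can be verified from the recipe with simple dilution arithmetic. The sketch below does this and also derives the working concentrations, assuming the mix is used at 1X (a 50-fold dilution, e.g. 0.6 ul in the 30 ul obtained by combining the 14 ul and 16 ul mixtures). The names are illustrative.

```python
STOCK_MM = 100.0  # each dNTP stock is 100 mM

# Volumes in ul that go into the 50X mix.
DNTP_STOCK_UL = {"dATP": 25.0, "dCTP": 25.0, "dGTP": 25.0, "dTTP": 10.0}
WATER_UL = 15.0


def mix_volume_ul() -> float:
    """Total volume of the 50X mix."""
    return sum(DNTP_STOCK_UL.values()) + WATER_UL


def mix_concentration_mm(nucleotide: str) -> float:
    """Concentration of one nucleotide in the 50X mix."""
    return STOCK_MM * DNTP_STOCK_UL[nucleotide] / mix_volume_ul()


def working_concentration_mm(nucleotide: str, fold: float = 50.0) -> float:
    """Concentration after diluting the 50X mix to 1X."""
    return mix_concentration_mm(nucleotide) / fold
```

The recipe yields 100 ul at 25 mM dATP, dCTP and dGTP and 10 mM dTTP, i.e. 0.5 mM and 0.2 mM respectively at 1X. The lower dTTP level leaves room for the Cy3- or Cy5-labeled dUTP to be incorporated.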

RNA degradation is performed as follows. Add 86 ul H2O, 1.5 ul 1 M NaOH/
2 mM EDTA and incubate at 65°C for 10 min. For U-Con 30, 500 ul TE/sample, spin at 7000 g
for 10 min, save flow through for purification. For Qiagen purification, suspend U-Con
recovered material in 500 ul buffer PB and proceed using the Qiagen protocol. For DNAse
digestion, add 1 ul of 1/100 dilution of DNAse/30 ul Rx and incubate at 37°C for 15 min.
Incubate for 5 min at 95°C to denature the DNAse.

Sample preparation

For sample preparation, add Cot-1 DNA, 10 ul; 50X dNTPs, 1 ul; 20X SSC,
2.3 ul; Na pyrophosphate, 7.5 ul; 10 mg/ml herring sperm DNA, 1 ul of 1/10 dilution; to
21.8 ul final vol. Dry in speed vac. Resuspend in 15 ul H2O. Add 0.38 ul 10% SDS. Heat
at 95°C for 2 min and slow cool at room temperature for 20 min. Put on slide and hybridize overnight at
64°C. Washing after the hybridization: 3X SSC/0.03% SDS: 2 min., 37.5 ml 20X
SSC + 0.75 ml 10% SDS in 250 ml H2O; 1X SSC: 5 min., 12.5 ml 20X SSC in 250 ml H2O;
0.2X SSC: 5 min., 2.5 ml 20X SSC in 250 ml H2O. Dry slides and scan at appropriate PMT's
and channels.
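
The three wash recipes follow from diluting the 20X SSC and 10% SDS stocks. The sketch below checks each recipe, under the assumption that "in 250 ml H2O" means a final volume of 250 ml; if the stocks were instead added on top of 250 ml of water the strengths would be slightly lower. The names are illustrative.

```python
TOTAL_ML = 250.0   # assumed final volume of each wash solution
SSC_STOCK_X = 20.0
SDS_STOCK_PCT = 10.0

# name -> (ml of 20X SSC, ml of 10% SDS), as given in the text
WASHES = {
    "3X SSC/0.03% SDS": (37.5, 0.75),
    "1X SSC": (12.5, 0.0),
    "0.2X SSC": (2.5, 0.0),
}


def final_strength(stock: float, stock_ml: float, total_ml: float = TOTAL_ML) -> float:
    """Final strength after diluting stock_ml of stock to total_ml."""
    return stock * stock_ml / total_ml


def wash_composition(name: str) -> tuple:
    """Return (SSC strength in X, SDS in percent) for a named wash."""
    ssc_ml, sds_ml = WASHES[name]
    return (final_strength(SSC_STOCK_X, ssc_ml), final_strength(SDS_STOCK_PCT, sds_ml))
```

Under that assumption the recipes give exactly the labeled strengths: 3X SSC with 0.03% SDS, 1X SSC and 0.2X SSC.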

Example 2: Taxol-resistant Xenograft Model of Human Prostate Cancer

Treatment regimens that include paclitaxel (Taxol; Bristol-Myers Squibb
Company, Princeton, NJ) have been particularly successful in treating hormone-refractory
prostate cancer in the phase II setting (Smith et al., Semin. Oncol. 26(1 Suppl 2):109-11
(1999)). However, many patients develop tumors which are initially, or later become,
resistant to taxol. To identify genes that may be involved with resistance to taxol, or are
regulated in response to taxol resistance, and therefore may be used to treat, or identify, taxol
resistance in patients, the following experiments were carried out.

The androgen-independent human cell line CWR22R was grown as a
xenograft in nude mice (Nagabhushan et al., Cancer Res. 56(13):3042-3046 (1996); Agus et
al., J. Natl. Cancer Inst. 91(21):1869-1876 (1999); Bubendorf et al., J. Natl. Cancer Inst.
91(20):1758-1764 (1999)). Initially, these xenograft tumors were sensitive to therapeutic
doses of taxol. The mice were treated continuously with sub-therapeutic doses, and the
tumors were allowed to grow for 3-4 weeks before surgical removal of the tumors. The
tumor from an individual mouse was then minced, and a small portion was then injected into
a healthy nude mouse, establishing a second passage of the tumor. This mouse was then
treated continuously with the same sub-therapeutic dose of taxol. This process was repeated
14 times, and a portion of each generation of xenograft tumor was collected. There was
increasing resistance to therapeutic doses of taxol with each generation. By the end of the
process, the tumors were fully resistant to therapeutic doses of taxol. RNA from each
generation of tumor was then isolated, and individual mRNA species were quantified using a
custom Affymetrix GeneChip® oligonucleotide microarray, with probes to interrogate
approximately 35,000 unique mRNA transcripts. Genes were selected that showed a
statistically significant up-regulation, or down-regulation, during the subsequent generations
of increasingly taxol-resistant tumors. Only one gene was significantly up-regulated, whereas
24 genes were down-regulated; these are presented in Table 10.
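
The text does not state which statistical test was used to select genes whose expression rose or fell across the tumor generations. One plausible way to score such a trend, shown here purely as an assumption, is the Spearman rank correlation between generation number and expression level. The implementation below is self-contained, and the expression values in the usage note are hypothetical.

```python
import math


def ranks(values):
    """Ranks starting at 1, with ties given their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


generations = list(range(1, 15))  # 14 passages of the xenograft
```

A hypothetical probe set whose signal falls steadily from passage 1 to passage 14 gives `spearman(generations, signal)` of -1, and one that rises steadily gives +1; values near 0 indicate no monotonic trend. A significance threshold would still have to be chosen and corrected for the roughly 35,000 transcripts tested.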
The gene sequences identified to be overexpressed in prostate cancer may be
used to identify coding regions from the public DNA database. The sequences may be used
either to identify genes that encode known proteins or to predict the coding
regions from genomic DNA using exon prediction algorithms, such as FGENESH (Salamov
and Solovyev, 2000, Genome Res. 10:516-522). In addition, one of ordinary skill in the art
would understand how to obtain the unigene cluster identification and sequence information
according to the exemplar accession numbers provided in Tables 1-16 (see,
http://www.ncbi.nlm.nih.gov/UniGene/).

TABLE 1: Shows genes, including expression sequence tags, differentially expressed in
prostate tumor tissue compared to normal tissue as analyzed using the Affymetrix/Eos Hu01
GeneChip array. Shown are the relative amounts of each gene expressed in prostate tumor
samples and various normal tissue samples showing the highest expression of the gene.

Pkey: Unique Eos probeset identifier number
ExAccn: Exemplar Accession number, Genbank accession number
UnigeneID: Unigene number
UnigeneTitle: Unigene gene title
R1: Ratio of tumor to normal body tissue



Pkey UnigenelD ExAccn 


1 tntnnatlA TIHa 

unuigene nue 


R1 


131919 HS.2724oo AA121266 


cols 


rnn 

Of 


120328 HS29Q9Q5 AA1 96979 


COT*.- UfnnUi rSrnHlar fn MnfTmA nn\ atn 

cols, weawy sirraiar to (oe tune noi ava 


MC 
OC.O 


105201 HS31412 AA1 95626 


CCTb 

cSTS 


on 1 

Ov. I 


101486 Hs.1852 M24902 


acid phosphatase; prostate 


91% O 


119073 Hs^ 79477 R32894 


ESTs 


OA R 
C**J> 


133428 Hs.1 83752 M34376 


rnicroseminoprotein; bata- 




128180 Hs.1 71995 AA595348 


kainkrein 3; (prostate spectttc antigen 


91 A 


104080 HS57771 AA402971 


Homo sapiens mRNA for serine protease (T 


1R Q 

10.9 


127537 Hs.1 62859 AA569531 


ESTs 


1R ft 
1 0.0 


131665 H&3D340 R22139 


to IS 


174 


101050 HS.1 832 KQT911 


neuropepooe y 


1 1.0 


130771 Hs.1 915 N48056 


folate hydrolase (jsrostate-spectuc rnemo 


17 

I / 


108153 Hs.40808 AAD54237 


ESTs 


icq 


107485 Ks2 62476 W63793 


S-adenosytrnetniorune ae carboxylase i 


1R7 
10./ 


IUd loo ns.iX£co/ M/v»cooiiy 


COI 5 


165 


129534 Hs.11260 R73640 ESTs 16.4 

100569 Hs.171995 HG2261-HT2351 Antigen, Prostate Specific, Alt. Splice 16 

101889 Hs.181350 S39329 kallikrein 2; prostatic 154 

135389 Hs39872 U05237 fetal Alzheimer antigen 15 

101506 Hs.62192 M27436 coagulation factor III (thromboplastin; 13.9 

134374 Hs3236 D62633 ESTs 12.7 

133944 Hs.7780 AA045870 ESTs 12J 

109141 Hs.193380 AA176428 ESTs 125 

130974 Hs2178 X57985 H2B histone family; member Q 115 

114768 Hs.182339 AA145007 ESTs 113 

104394 Hs.172129 H46617 yp19h1 rl Soares breast 3NbHBst Homo sap 113 

125299 Hs.102720 Z39436 ESTs 11JB 

104660 Hs.14846 AA007160 ESTs 114 

100116 Hs.76045 D00654 actin; gamma 2; smooth muscle; enteric 11 

131061 Hs268744 N64328 ESTs; Moderately similar to KIAA0273 (H. 103 

126645 126645 AI167942 Homo sapiens BAC clone RG041D11 from 7q2 107 

135153 Hs35420 N40141 Homo sapiens mRNA for JM27 protein; comp 10J6 

107033 Hs.113314 AA599629 ESTs 103 

118417 N66048 ESTs; Weakly similar to polymerase \Hsa 103 

126758 Hs.293960 W37145 ESTs 102 

115674 Hs.8364 AA406542 ESTs 10.1 

134989 Hs.92381 AA236324 ESTs; Weakly similar to !!!! ALU CLASS A 10.1 

107102 Hs.30652 AA609723 ESTs 10.1 

116787 Hs.15641 H28581 ESTs 10.1 

115719 Hs59622 AA416997 ESTs 10 

123209 Hs£Q3270 AA489711 ESTs 9.9 

101664 Hs.121017 M60752 H2A histone family; member A 93 

112971 Hs.83883 T17185 ESTs 97 

102519 Hs3Q296 U52969 Purkinje cell protein 4 97 

117984 Hs.106778 N51919 ESTs 97 

105840 Hs.22209 AA398533 ESTs 94 

129523 HS274509 M30894 T-cell receptor; gamma cluster 94 

132964 Hs.167133 AA031360 ESTs 92 

121853 Hs385G2 AA425887 ESTs 9 
115764 Hs51011 AA421562 anterior gradient 2 (Xenopus laevis; sec 85 

119617 Hs55999 W47380 ESTs 8.0 

100552 Hs.301846 HG2167-HT2237 Protein Kinase Ht31, Camp-Dependent 8.9 

105627 Hs.23317 AA281245 ESTs 83 

101461 Hs.76422 M22430 phospholipase A2; group IIA (platelets; 87 

131725 Hs.31146 AA456264 ESTs; Highly similar to (defline not ava 85 

124526 Hs.293185 N62056 yz61c5.s1 Soares_multiple_sclerosis_2NbH 8.5 

118528 Hs.49397 N67889 ESTs 8.2 

133845 Hs.76704 T68510 ESTs 82 

133354 Hs.334762 AA055552 ESTs; Weakly similar to KIAA0319 [H.sapi 6.1 

105912 Hs.20415 AA402000 ESTs; Weakly similar to GS3786 (H.sapien 8 

119018 H&27B695 N95786 ESTs 8 

100394 Hs.66052 D84276 CD38 antigen (p45) 6 

114132 HS24192 Z38688 ESTs 75 

116786 Hs501527 H25836 tumor necrosis factor (ligand) superfami 7.7 

106579 Hs.23023 AA456135 ESTs 75 

128790 Hs.105700 AA291725 secreted frizzled-related protein 4 75 

114965 Hs.72472 AA250737 ESTs 7A 

112033 Hs.22627 R43162 ESTs 7.1 

102398 U42359 Human N33 protein form 1 (N33) gene, exo 7 

101201 Hs.2256 L22524 matrix metalloproteinase 7 (matrilysin; 65 

109272 Hs.288462 AA195718 ESTs 6.9 

103145 Hs.169849 X66276 myosin-binding protein C; slow-type 65 

101803 Hs.155691 M86546 pre-B-cell leukemia transcription factor 65 

120562 H$5Q2267 AA280035 ESTs; Weakly similar to W01 A6.C (Celega 6.8 

109112 Hs257924 AA169379 ESTs 6.8 

109795 Hs526416 F10707 ESTs 6.7 

107532 Hs.173684 Z19643 ESTs; Weakly similar to (defline not ava 6.7 

130336 Hs.171995 X07730 kallikrein 3; (prostate specific antigen 6.6 

131425 Hs.26691 AA219134 ESTs 6.6 

120588 Hs.16193 AA281591 Homo sapiens mRNA; cDNA DKFZp586B211 (fr 6.6 

132902 Hs59838 AA490969 ESTs 6.6 

125674 Hs.323378 W28078 H.sapiens mRNA for transmembrane protein 6.6 

133724 Hs.75746 U07919 aldehyde dehydrogenase 6 6.5 

130343 Hs27B628 AA490262 ESTs; Moderately similar to APXL gene pr 63 

120215 Hs.108787 Z41050 Homo sapiens Mcd4p homolog mRNA; complet 6.5 

129215 Hs.126085 AA176867 ESTs 65 

131881 Hs.3383 AA010163 upstream regulatory element binding prot 65 

133376 Hs.7232 T23670 ESTs 6.4 

105376 Hs.8768 AA236559 ESTs; Weakly similar to neuronal thread 6.4 

104674 Hs.26289 AA009527 ESTs 6.4 

100727 Hs534786 X07290 Human HF.12 gene mRNA 65 

130150 Hs.15113 AF000573 homogentisate 1;2-dioxygenase (homogenti 65 

121770 Hs.278428 AA421714 Homo sapiens mRNA for KIAA0896 protein; 65 

123475 Hs.250528 AA599267 ESTs; Weakly similar to ANKYRIN; BRAIN V 65 

133061 Hs296638 AB000584 prostate differentiation factor 6.3 

116429 Hs279923 AA609710 ESTs; Weakly similar to similar to GTP-b 62 

101233 Hs578 L29008 sorbitol dehydrogenase 62 

104691 HS57744 AA011176 ESTs 6J2 

127248 AA325029 EST27953 Cerebellum II Homo sapiens cDNA 6.2 

127775 Hs.179902 H04106 ESTs; Weakly similar to (defline not ava 62 

105500 Hs.222399 AA256485 ESTs 6.1 

131463 Hs.2714 X74142 forkhead (Drosophila)-like 1 6.1 

132116 Hs.40289 AA234767 ESTs 6 

130828 Hs.203213 AA053400 ESTs 55 

115357 Hs.72988 AA281793 ESTs 55 

105496 HS501997 AA256323 ESTs 5.7 

116334 Hs.46948 AA491457 ESTs 5.7 

107968 Hs.61539 AA034020 ESTs 5.7 

120132 Hs.125019 Z38839 ESTs; Weakly similar to !!!! ALU SUBFAMI 5.6 

106375 HS289072 AA443993 ESTs 5.6 

132550 Hs.170195 AA029597 bone morphogenetic protein 7 (osteogenic 5.6 

124777 Hs.140237 R41933 ESTs; Weakly similar to neuronal thread 5.6 

100311 Hs537616 D50640 phosphodiesterase 3B; cGMP-inhibited 5.6 

101791 Hs.62354 M83822 Human beige-like protein (BGL) mRNA; par 55 

117698 Hs.45107 N41002 ESTs 5.5 

132387 Hs281434 R70914 heat shock 70kD protein 1 55 

122041 Hs.98732 AA431407 Homo sapiens Chromosome 16 BAC clone CIT 55 

133723 Hs.262476 AA088851 S-adenosylmethionine decarboxylase 1 55 

113938 W81598 ESTs 5.4 

133015 Hs£46315 AA047036 ESTs 5.4 

125745 Hs.75722 AI283493 ribophorin II 5.4 

107295 Hs*0120 T34527 UDP-N-acetyl-alpha-D-galactos 5.4 

106188 Hs.7780 AA056482 ESTs 5.3 

100184 Hs.21223 D17408 calponin 1; basic; smooth muscle 5.3 

104466 Hs.326392 N25110 Human guanine nucleotide exchange factor 5.3 

104033 Hs.98944 AA365031 ESTs 5.3 

110844 Hs.167531 N31952 ESTs; Weakly similar to (defline not ava 5.3 

129056 Hs.108336 H70627 ESTs; Weakly similar to !!!! ALU SUBFAMI 5.3 

102805 Hs.25351 U90304 iroquois-class homeodomain protein 5.3 

133493 Hs.194369 AA284143 Homo sapiens chromosome 1 atrophin-1 rel 5.3 

129184 Hs.109201 W26769 ESTs; Highly similar to (defline not ava S2 

134158 Hs.79428 U15174 BCL2/adenovirus E1B 19kD-interacting pro B2 

107240 Hs.159872 D59368 ESTs $2 

104787 AA027317 ESTs; Weakly similar to !!!! ALU SUBFAMI $2 

123527 Hs.108327 AA608679 damage-specific DNA binding protein 1 (1 55 

116646 Hs.194228 F03048 ESTs; Moderately similar to !!!! ALU SUB 52 

101448 Hs.195850 M21389 keratin 5 (epidermolysis bullosa simplex 5.1 

116188 Hs.184598 AM64728 ESTs; Weakly similar to !!!! ALU SUBFAMI 5.1 

126259 Hs.281428 Z21472 ESTs; Moderately similar to !!!! ALU SUB 5.1 

105921 Hs.169119 AM02613 ESTs 5.1 

103375 HS54416 X91666 sine oculis homeobox (Drosophila) homolo 5.1 

128871 Hs.106778 AA400271 ESTs; Highly similar to (defline not ava 5.1 

112681 Hs.148932 R87331 ESTs; Moderately similar to semaphorin V 5.1 

105784 Hs£26434 AA350771 ESTs 5.1 

116238 Hs.47144 AM79362 ESTs 5 

102913 Hs.80342 X07696 keratin 15 5 

103011 Hs.326035 X52541 early growth response 1 5 

126023 H58881 yr36d09.r1 Soares fetal liver spleen 1NF 5 

103709 Hs.13804 AA037316 ESTs 5 

118981 Hs.39288 N93839 ESTs; Weakly similar to !!!! ALU SUBFAMI 5 

134807 Hs£9732 X78932 zinc finger protein 273 5 

100079 Hs.23311 AB002365 Human mRNA for KIAA0367 gene; partial cd 4.9 

132047 Hs.3796 D83492 EphB6 4.9 

132680 Hs.177537 AA444369 ESTs 4.9 

124049 Hs.74519 F10523 primase; polypeptide 2A (58kD) 4.8 

133330 Hs.71119 U42360 Human N33 mRNA; complete cds 4.8 

104776 AA026349 ESTs 4.8 

122593 Hs.128749 AA453310 Homo sapiens alpha-methylacyl-CoA racema 4.8 

103912 Hs.143087 AA251078 ESTs 4.8 

113961 Hs.26009 W86307 Homo sapiens mRNA for KIAA0860 protein; 4.8 

105288 Hs3585 AA233168 ESTs; Weakly similar to coded for by C. 4.8 

135035 Hs.284186 H89575 ESTs 4.8 

104144 Hs.183390 AA447439 ESTs; Weakly similar to ZINC FINGER PROT 4.8 

129389 Hs.268126 AA621604 ESTs 4.8 

125982 R98091 RAE1 (RNA export 1; S.pombe) homolog 4.8 

125162 Hs.26243 W44682 ESTs 4.6 

103023 Hs.117950 X53793 multifunctional polypeptide similar to S 4.7 

129735 W80701 ESTs; Weakly similar to HERV-E envelope 4.7 

104479 Hs.106390 N36040 ESTs 4.7 

103731 AA070545 zm7c3 jl Stratagene neuroepithelium (#93 4.7 

126575 Hs.127602 W72416 ESTs 4.7 

124578 Hs231500 N66321 Human glucose transporter-like protein-l 4.7 

130617 Hs.1674 M90516 glutamine-fructose-6-phosphate transamin 4.7 

116752 Hs.91622 H06373 Homo sapiens clone 24456 mRNA sequence 4.7 

100279 Hs*2007 D42084 Human mRNA for KIAA0094 gene; partial cd 4.7 

126288 Hs£9576 AI479264 ESTs 4.7 

131836 Hs.32990 AA610086 ESTs 4.7 

106717 Hs239489 AA465093 TIA1 cytotoxic granule-associated RNA-bi 4.7 

114542 Hs£1011 AA055768 ESTs 4.6 

103806 AA130614 zo1f2jl Stratagene neuroepithelium NT2R 4.6 

130529 AA173238 small inducible cytokine A5 (RANTES) 4.6 

115675 Hs.82065 AA406546 ESTs 4.6 

111386 Hs£93798 N95326 ESTs 4.6 

106503 Hs.29679 AA452411 ESTs 4.6 

119943 Hs.14158 W86835 copine III 4.6 

104459 Hs.100070 M91493 EST 4.6 

100774 Hs.89603 HG371-HT1063 Mucin 1, Epithelial, Alt. Splice 6 46 
100652 Hs.142653 HG2825-HT2949 Ret Transforming Gene 4.6 

132015 Hs5731 D11900 ESTs 4.6 

126086 H70975 yr73g01 Jl Soares fetal liver spleen 1NF 4.6 

130888 Hs.173094 F03819 ESTs 4.6 

106390 Hs5016B AA446964 Prostate stem cell antigen 4.6 

126959 AA199653 ESTs; Moderately similar to !!!! ALU SUB 45 

131584 HS29117 X91648 H.sapiens mRNA for pur alpha extended 3 1 45 

104638 Hs50953 AA039481 ESTs 45 

125661 H50319 ESTs 45 

103171 Hs534726 X66733 alpha-1-antichymotrypsin 45 

103928 Hs.199160 AA280085 ESTs 45 

102899 Hs.75730 X06272 signal recognition particle receptor fd 45 

100892 Hs.180769 HG4557-HT4962 Small Nuclear Ribonucleoprotein U1 , Isnr 45 

106167 Hs.7956 AA425906 ESTs 45 

129404 Hs517584 AA172056 ESTs 45 

106950 Hs5475B AA521354 ESTs 45 

132316 Hs.44566 U28831 Human protein immuno-reactive with anti- 4.4 

132056 Hs.38176 T89386 Homo sapiens mRNA for KIAA0606 protein; 44 

133718 Hs.188760 X15306 neurofilament; heavy polypeptide (200kD) 44 

101470 Hs.1846 M22898 tumor protein p53 (Li-Fraumeni syndrome) 4.4 

131904 Hs.284296 AA143019 ESTs; Highly similar to surface 4 integr 4.4 

105804 Hs52514 AA383142 ESTs 44 

122861 Hs.119394 AA464428 ESTs 4.4 

111336 Hs59894 N79565 ESTs 44 

121944 Hs.98518 AA429278 ESTs 44 

134401 Hs511577 AA243746 ESTs; Highly similar to CGI protein [H.s 44 

126458 Hs.288969 AA815252 ESTs; Weakly similar to !!!! ALU SUBFAMI 44 

133435 Hs523966 T23983 ESTs; Moderately similar to !!!! ALU SUB 44 

105178 Hs51941 AA187490 ESTs 45 

127315 AA640834 nr27b06.s1 NCI_CGAP_Pr3 Homo sapiens cDN 4.3 

132645 Hs54424 X87870 H.sapiens mRNA for hepatocyte nuclear fa 45 

116162 Hs.282990 AA461487 ESTs; Weakly similar to F52C125 [Celeg 4.3 

118040 Hs.47567 N52876 EST 45 

130008 Hs.278427 M31423 cerebellar degeneration-related protein 45 

126607 Hs.114688 W87424 ESTs 45 

123061 Hs.105130 AA482030 EST 45 

109391 Hs.184245 AA219699 ESTs 45 

109175 AA180498 ESTs 45 

127003 Hs.173540 AA550806 ESTs; Weakly similar to (defline not ava 45 

102547 Hs.46638 U57911 chromosome 11 open reading frame 8 45 

134208 Hs.79993 U88871 peroxisomal biogenesis factor 7 45 

104258 Hs5462 AF007216 solute carrier family 4; sodium bicarbon 45 

130759 Hs.18946 AA094720 ESTs; Weakly similar to (defline not ava 45 

132160 Hs595923 AA281770 seven in absentia (Drosophila) homolog 1 45 

135062 Hs53872 AA174183 ESTs 45 

126510 HS534762 R49702 ESTs; Weakly similar to KIAA0319 [H.sapi 42 

122055 HS58747 AA431732 EST 45 

133136 Hs.6574 AF007165 suppressin (nuclear deformed epidermal a 45 

109890 Hs50843 H04649 ESTs 45 

133294 Hs.69997 R79723 H.sapiens mRNA for translin associated z 45 

134436 Hs.83190 S80437 fatty acid synthase {3' region} [human, 45 

107375 HS551064 U88573 NBR2 45 

122223 HS57413 AA436158 ESTs 45 

103044 Hs548210 X55777 H.sapiens Mahlavu hepatocellular carcino 45 

120125 HS59815 W99362 EST 45 

128969 Hs583978 T65327 ESTs; Highly similar to (defline not ava 45 

129637 Hs.1179 D90359 TATA box binding protein (TBP)-associate 45 

106566 AA455921 ESTs; Weakly similar to !!!! ALU SUBFAMI 45 

112605 HS59852 R79220 ESTs 45 

103364 HS579929 X90872 H.sapiens mRNA for gp25L2 protein 45 

132811 Hs57419 U25435 transcriptional repressor 45 

126570 Hs526292 T79274 ESTs 45 

116298 Hs.94109 AA489046 ESTs 45 

103024 Hs.105938 X53961 lactotransferrin 4.1 

129133 Hs.108850 R56728 yg95o6.r1 Soares infant brain 1NIB Homo 4.1 

133167 Hs.6641 N98707 kinesin family member 5C 4.1 

126871 Hs.14051 AA351779 ESTs 4.1 

132333 Hs.45032 AA192157 ESTs 4.1 

107376 Hs.327179 U90545 solute carrier family 17 (sodium phospha 4.1 
128517 Hs.100861 AA280617 ESTs; Weakly similar to p60 katanin [H.s 4.1 

130555 Hs.116774 AA450324 ESTs 4.1 

105765 Hs.24183 AA343514 ESTs 4.1 

126529 Hs2G369 AA133237 ESTs 4.1 

125928 Hs.181889 H29730 ESTs 4.1 

117280 Hs.172129 N22107 ESTs; Moderately similar to !!!! ALU SUB 4.1 

100234 Hs.3085 D29677 KIAA0054 gene product 4.1 

100959 Hs.118127 J00073 actin; alpha; cardiac muscle 4.1 

107130 Hs.12913 AA620582 ESTs; Weakly similar to (defline not ava 4.1 

105035 Hs.8859 AA128486 ESTs 4.1 

126735 Hs.226795 AA808949 glutathione S-transferase pi 4.1 

113056 Hs.8036 T26471 ESTs; Moderately similar to !!!! ALU SUB 4 

102460 Hs.211582 U46959 Homo sapiens myosin light chain kinase ( 4 

106968 Hs.26813 AA504631 ESTs; Weakly similar to (defline not ava 4 

123107 Hs.104207 AA486071 ESTs 4 

127256 Hs.267967 AA327550 ESTs; Weakly similar to !!!! ALU SUBFAMI 4 

105329 Hs22862 AA234561 ESTs 4 

115504 Hs.42736 AA291946 ESTs 4 

120726 Hs.97293 AA293656 ESTs 4 

103576 Hs.94550 Z26317 desmoglein 2 4 

127889 Hs.144941 AI147408 ESTs 4 

106394 Hs.25320 AA447223 ESTs 4 

128046 AA873285 ESTs 4 

103391 Hs.114366 X94453 pyrroline-5-carboxylate synthetase (glut 4 

106448 Hs£7D04 AA449455 ESTs 4 

126513 Hs.86276 W27601 ESTs; Moderately similar to (defline not 4 

129593 HsH8314 AA487015 ESTs; Weakly similar to !!!! ALU SUBFAMI 3.9 

110151 H$51608 H18836 ESTs 35 

105344 Hs.8645 AA235303 ESTs 3.9 

104791 Ha501871 AA029046 ESTs 35 

123442 Hs.111496 AA598803 ESTs 3.9 

127800 Hs.79428 AA521047 BCL2/adenovirus E1B 19kD-interacting pro 3.9 

114555 Hs.167904 AA058594 ESTs 3.9 

122138 Hs.163960 AA435549 ESTs 35 

129565 Hs.198726 X77777 vasoactive intestinal peptide receptor 1 3.9 

103471 Hs.75216 Y00815 protein tyrosine phosphatase; receptor t 3.9 

133908 Hs.325474 M83216 caldesmon 1 3.9 

105635 Hs^01985 AA281508 ESTs 3.9 

134285 Hs.81056 AA460012 solute carrier family 22 (organic cation 35 

134125 Hs.50421 R38102 KIAA0203 gene product 35 

125628 Hs241493 AA418069 natural killer-tumor recognition sequenc 35 

103695 Hs.185600 AA018758 ESTs 35 

100642 Hs.182183 HG2743-HT3926 Caldesmon 1, Alt. Splice 6, Non-Muscle 35 

104334 Hs.78771 D82614 ESTs 35 

110242 Hs.19978 H26417 ESTs 35 

125298 Hs.269008 Z39255 ESTs 35 

104060 Hs503193 AA397968 zt87a9.r1 Soares_testis_NHT Homo sapiens 35 

105823 Hs393960 AA398197 ESTs 35 

126499 Hs.110445 AA315671 ESTs; Moderately similar to unknown [Mm 35 

130752 Hs.18895 D50927 KIAA0137 gene product 35 

123494 Hs.112110 AA599786 ESTs 35 

104846 Hs.32478 AA040154 ESTs 35 

108921 Hs.71721 AA142913 ESTs 35 

115506 Hs.45207 AA292537 ESTs 35 

100452 Hs.241552 D87742 Human mRNA for KIAA0268 gene; partial cd 3.8 

104454 Hs.129228 M84443 galactokinase 2 35 

108730 Hs.102859 AA126254 ESTs 35 

131223 Hs24427 AA247788 ESTs; Highly similar to (defline not ava 35 

104784 Hs.269228 AA027055 ESTs 35 

104946 Hs.73848 AA069549 ESTs 35 

106932 Hs.9394 AA495926 ESTs 35 

101724 Hs.620 M69225 bullous pemphigoid antigen 1 (230/240kD) 35 

106140 Hs.14912 AA424524 Homo sapiens mRNA for KIAA0286 gene; par 35 

128135 Hs.269721 AA913491 ESTs 35 

120030 Hs.58694 W92051 ESTs 3.8 

126457 Hs.50382 AA007489 zh98g04.r1 Soares_fetal_liver_spleen_1NF 3.8 

123917 Hs.112969 AA621311 EST 3.7 

110714 Hs.17752 H95978 Homo sapiens phosphatidylserine-specific 3.7 

130577 Hs.162 M35410 insulin-like growth factor binding prote 3.7 
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117667 
126104 
100379 
115646 
125792 
102162 
128530 
119940 
110769 
132914 
113594 
103702 
130780 
123288 
120691 
103153 
129201 
114798 
126801 
105503 
104260 
125980 
123255 



Hs.44708 

HS39712 

H&27B721 

Hs.305971 

Hs.193700 

Hs.1592 

Hs.183475 

H&272531 

Hs.23837 

Hs.60293 

Hs.15683 

H&279952 

Hs.19347 

H&291025 

Hs.22380 

Hs.75295 

Hs.109390 

Hs54900 

Hs.7337 

Hs51707 

Hs.194283 



106709 
127858 
101964 
105508 
116844 
105372 
100745 
127521 
110758 
107307 
133200 
114774 
120265 
134359 
116250 



131898 
133444 
126232 
135357 
457951 
108407 
126659 
104189 



103026 
133011 
131379 
126742 
105560 
118472 
105623 
120262 
105027 
130760 
117473 



Hs.105273 
Hs.6363 
Hs.121686 
Hs/166994 

Hs.302738 
Hs.75511 

Hs.48428 
Hs.68554 
H&22983 
Hs.170291 
Hs27973 

Hs^26416 

Hs.337434 

Hs.142298 

Hs.144630 

Hs.164018 

HS274265 

Hs.44155 

Hs.183639 

Hs.184325 

HS270696 

Hs.199067 

Hs44829 

H&35841 

Hs.279780 

Hs.73793 

Hs334641 

Hs.78572 



Hs.301804 

Hs.129014 

Hs.79386 

Hs.171921 

HS26176 

Hs.169359 

Hs306915 

Hs.42179 

HS30127 

Hs.145807 

HS26771 

Hs.18953 

Hs.155560 



N39214 ser-Thr protein kinase related to the my 37 

N77278 ESTs; Weakly similar to BONE/CARTILAGE P 3.7 

D82060 Homo sapiens mRNA for membrane protein w 3.7 

M404352 ESTs 3.7 

M005388 ESTs; Moderately similar to !!!! ALU SUB 3.7 

U18291 CDC16 (cell division cycle 16; S.cerevi 3.7 

AA504343 ESTs; Moderately similar to !!!! ALU SUB 3.7 

W86779 EST 3.7 

M22222 yw34b06.s1 Morton Fetal Cochlea Homo sap 3.7 

AA496037 ESTs 3.7 

T92030 ESTs 3.7 

AA027793 ESTs; Highly similar to (defline not ava 3.7 

AA248406 ESTs 3.7 

AA495836 EST 3.7 

AA291173 ESTs 3.7 

X66534 guanylate cyclase 1; soluble; alpha 3 3.7 

H19989 ESTs 3.7 

AA159181 ESTs 3.7 

AA512902 ESTs 3.7 

AA256616 ESTs 3.7 

AF008192 Homo sapiens putative GR6 protein (GR6) 3.7 

R97219 ESTs 3.7 

AA480890 ESTs 3.6 

AA206625 ESTs 35 

HG3162-HT3339 Transcription Factor IIa 3.6 

X87241 FAT tumor suppressor (Drosophila) homolo 3.6 

Y10511 H.sapiens mRNA for C0176 protein 3.6 

W15263 ESTs 3.6 

M92934 connective tissue growth factor 3.6 

T97307 ESTs; Moderately similar to !!!! ALU SUB 3.6 

N59800 EST 3.6 

C20780 EST 3.6 

AA400517 ESTs; Moderately similar to UDP-GLUCOSE; 3.6 

AA464696 ESTs 3.6 

AA806365 oc26h07.s1 NCI_CGAP_GCB1 Homo sapiens cD 3.6 

S81578 dioxin-responsive gene {putative polyade 3.6 

AA256680 ESTs 3.6 

H64938 ESTs 3.6 

AA236481 ESTs 35 

HG3510-HT3704 V-Erba Related Ear-3 Protein 3.6 

AA809982 ESTs 3.6 

N21365 talin 3.6 

T52099 creatine kinase; mitochondrial 2 (sarcom 3.6 

AA432248 ESTs 3.6 

AA150043 ESTs 3.6 

AA173759 ESTs; Moderately similar to !!!! ALU SUB 3.6 

M34309 v-erb-b2 avian erythroblastic leukemia v 3.6 

AA480975 ESTs; Moderately similar to !!!! ALU SUB 3.6 

AA438459 nuclear factor I/X (CCAAT-binding transc 3.6 

N52232 ESTs 3.6 

M27281 vascular endothelial growth factor 3.6 

H06296 ESTs 3.6 

AA235803 ESTs 3.5 

AI369384 arylsulfatase D 35 

AA075519 zm87h9.s1 Stratagene ovarian cancer (#93 35 

T16245 a disintegrin and metalloproteinase doma 35 

AA485805 ESTs 35 

N53276 ESTs 35 

X54162 Human mRNA for a 64 Kd autoantigen expre 35 

AA042990 sema domain; immunoglobulin domain (Ig); 35 

R49035 ESTs 35 

H64106 yr57e06.r1 Soares fetal liver spleen 1NF 3.5 

AA262783 ESTs 35 

N66818 ESTs 35 

AA280895 ESTs; Highly similar to !!!! ALU SUBFAMI 35 

AA172076 ESTs; Moderately similar to !!!! ALU SUB 35 

AA126472 ESTs 35 

AA128997 phosphodiesterase 8A 35 

N30157 ESTs 35 
102653 Hs.168075 U70322 karyopherin (importin) beta 2 35 

126349 Hs.13531 AM42868 ESTs; Weakly similar to (defline not ava 35 

132154 Hs.41119 N67170 ESTs 35 

131689 Hs.30696 AA599653 transcription factor-like 5 (basic helix 35 

127862 Hs.163191 AA765305 EST 35 

126995 Hs.189810 W26950 Human DNA sequence from PAC 388M5 on chr 35 

119071 R31180 ESTs 35 

103941 Hs.96593 AA282978 ESTs 35 

110721 Hs.31319 H97678 ESTs 35 

126586 Hs.43086 AA011247 ESTs 35 

103106 Hs.1857 X62025 phosphodiesterase 6G; cGMP-specific; rod 35 

116357 HSX0797 AA504806 Homo sapiens clone 23620 mRNA sequence 35 

105309 Hs4104 AA233790 ESTs 35 

130786 Hs.19525 R39390 ESTs 35 

109101 HS52184 AA167708 ESTs 35 

103134 Hs.2839 X65724 Norrie disease (pseudoglioma) 35 

131798 Hs501449 X86098 adenovirus 5 E1A binding protein 35 

118535 Hs.49418 N67968 ESTs 35 

102592 Hs.11223 U62389 Human putative cytosolic NADP-dependent 3.4 

125905 Hs5456 T69868 chaperonin containing TCP1; subunit 2 (b 3.4 

109160 Hs501997 AA179387 ESTs 34 

105327 Hs.211593 AA234440 ESTs 34 

106586 HS57787 AA456598 ESTs 3.4 

122635 AA454085 EST 34 

132413 Hs.260116 AA132969 metalloprotease 1 (pitrilysin family) 34 

131938 HS34956 AA283620 ESTs 34 

133871 Hs.182793 AA454597 ESTs 3.4 

107175 Hs£92503 AA621751 ESTs; Weakly similar to KIAA0601 protein 34 

101188 Hs.184298 L20320 cyclin-dependent kinase 7 (homolog of Xe 34 

126422 Hs.237658 H48518 ESTs; Highly similar to apolipoprotein A 34 

118475 N66845 ESTs; Weakly similar to !!!! ALU CLASS B 34 

104558 HS58959 R56678 ESTs; Weakly similar to !!!! ALU SUBFAMI 34 

128307 Hs.132005 AI453794 ESTs 3.4 

112254 Hs35829 R51831 ESTs 34 

125408 Hs.89578 N72353 yv37e12j1 Soares fetal liver spleen 1NF 34 

109834 Hs.175955 H00604 ESTs 34 

130844 Hs£0191 D12122 seven in absentia (Drosophila) homolog 2 34 

127143 Hs.20843 AA533553 n}68h04^1 NCI_CGAP_Pr10 Homo sapiens cD 34 

135309 Hs.42500 D25984 ESTs 34 

125724 Hs.295978 AA083407 stimulated trans-acting factor (50 kDa) 34 

127692 Hs.187983 AI021912 ESTs 34 

116674 Hs52127 F04816 ESTs 34 

134700 HS5868 AA481414 golgi SNAP receptor complex member 1 34 

114846 Hs.166196 AA234929 ESTs 34 

103649 Hs.155983 Z70219 H.sapiens mRNA for 5'UTR for unknown pro 34 

134835 Hs.89925 L04569 calcium channel; voltage-dependent; Lty 34 

130568 Hs.16085 AA232535 ESTs; Highly similar to (defline not ava 3.4 

111331 Hs.15978 N78773 ESTs 34 

106036 Hs.10653 AA412505 ESTs 34 

130987 Hs.21893 R45698 ESTs 34 

112614 HS55828 R98192 ESTs 34 

127815 Hs255015 AA676009 ob93c10.s1 NCI_CGAP_GCB1 Homo sapiens cD 34 

100144 Hs.75616 D13643 KIAA0018 gene product 34 

101129 HS247892 L10405 Homo sapiens DNA binding protein for sur 34 

130874 Hs.20621 T08287 ESTs 34 

106882 Hs.26994 AA489009 ESTs 34 

103855 Hs5Q2267 AA195179 ESTs 34 

125957 H45213 yo03b08j1 Soares adult brain N2b5HB55Y 3.3 

114048 Hs.146085 W94613 ESTs 3.3 

109826 Hs.75354 F13702 ESTs 35 

125355 Hs.170098 R45630 ESTs; Highly similar to KIAA0372 [H.sapi 3.3 

104182 Hs.143792 AA479990 ESTs; Weakly similar to glioma amplified 3.3 

100294 Hs.75454 D49396 Human mRNA for Apo1_Human (MER5(Aop1-Mou 35 

131688 Hs50692 U24153 p21 (CDKN1A)-activated kinase 2 3.3 

116256 Hs58201 AA481256 ESTs; Weakly similar to (defline not ava 35 

102034 Hs.230 U05291 fibromodulin 35 

130072 Hs.14658 R99606 Human chromosome 5q13.1 clone 5G8 mRNA 35 

114615 Hs.159456 AA083812 ESTs; Highly similar to (defline not ava 35 

128707 Hs.104105 AA136474 Meis (mouse) homolog 2 3.3 
115048 Hs.190057 AA252668 ESTs 33 

125852 HS31110 H12084 ESTs 33 

135142 HS34192 R31679 ESTs 33 

103119 Hs3877 X63629 cadherin 3; P-cadherin (placental) 33 

104460 Hs.62604 M91504 ESTs 33 

100365 Hs79284 D78611 mesoderm specific transcript (mouse) hom 33 

131524 HS301804 N39152 ESTs 33 

102165 Hs.159627 U18321 Death associated protein 3 33 

126966 Hs.182575 R38438 solute carrier family 15 (H+/peptide tra 33 

124839 Hs.140942 R55784 ESTs 33 

100709 Hs.100469 HG3264-HT3441 Att (GMJQ2478) 33 

132967 Hs.61635 AA032221 Homo sapiens BAC clone RG041D11 from 7q2 33 

102927 Hs.65114 X12876 keratin 18 33 

132616 Hs383558 AA385264 ESTs 33 

125132 Hs.129781 W15495 ESTs 33 

111225 HS31652 N68989 ESTs 33 

114956 HS37113 AA243681 ESTs 33 

122235 Hs.112227 AA436475 ESTs 33 

112325 Hs.12315 R56055 ESTs 33 

123360 Hs.178604 AA504784 ESTs 33 

105150 Hs.155995 AA169640 Homo sapiens mRNA for KIAA0643 protein; 33 

107391 Hs384294 W02877 ESTs 33 

113058 Hs.7569 T26893 EST 33 

134371 Hs.82318 S69790 Brush-1 33 

125669 Hs.333256 R51308 ESTs; Moderately similar to !!!! ALU SUB 33 

111506 Hs394105 R07726 ESTs 33 

122974 Hs.194215 AA478625 ESTs 33 

102369 Hs.299867 U39840 hepatocyte nuclear factor 3; alpha 33 

120408 Hs.190151 AA235045 ESTs 33 

117993 Hs.47402 N52039 ESTs; Weakly similar to !!!! ALU SUBFAMI 33 

129586 Hs.11500 AA437118 ESTs 33 

128138 Hs.126494 AI200825 ESTs 33 

127265 AA332751 EST37214 Embryo, 8 week I Homo sapiens c 33 

107674 Hs.41143 AA011027 Homo sapiens mRNA for KIAA0581 protein; 33 

104866 H$393691 AA045342 ESTs 33 

103427 Hs350655 X97303 H.sapiens mRNA for Ptg-12 protein 33 

132990 HS334334 AA458761 ESTs 33 

127017 HS351946 AA740146 ESTs 33 

132313 Hs.44481 U13220 forkhead (Drosophila)-like 6 33 

106880 HS32425 AA488889 ESTs 33 

107039 Hs.169780 AA599751 homologous to yeast nitrogen permease (c 33 

120870 Hs392581 AA357172 ESTs 33 

107920 HS384207 AA027951 ESTs 33 

104165 Hs.105116 AA459160 EST 33 

107012 Hs.63908 AA598745 ESTs 33 

103605 Hs.194657 Z35402 H.sapiens gene encoding E-cadherin, exon 33 

124006 Hs370016 D60302 ESTs 33 

101300 Hs.74137 L40391 Homo sapiens (clone s153) mRNA fragment 33 

101183 Hs.795 L19779 H2A histone family; member O 33 

125596 R25698 yg44h1 1 XI Soares infant brain 1NIB Homo 33 

127261 AA661567 nu86b02.s1 NCI JJGAPAM Homo sapiens cD 33 

120090 Hs.59554 W94591 ESTs 33 

129393 Hs.166982 D13435 phosphatidylinositol glycan; class F 33 

120923 Hs.97129 AA382283 ESTs 33 

118907 HS374256 N91003 ESTs 33 

111552 HS.1911B5 R09411 ESTs 33 

104431 Hs.99913 J03019 adrenergic; beta-1-; receptor 33 

133551 Hs378634 D63480 Human mRNA for KIAA0146 gene; partial cd 33 

131615 Hs.192803 D14533 xeroderma pigmentosum; complementation g 33 

126547 HS34072 U47732 transmembrane 4 superfamily member 3 33 

103172 Hs.116774 X68742 integrin; alpha 1 33 

113867 HS34095 W68845 ESTs 33 

133323 Hs.70937 Z83735 H3 histone family; member K 33 

111597 Hs.189716 R11499 ESTs 33 

121515 Hs.104696 AM12133 ESTs 33 

107445 Hs.6639 W28406 ESTs 33 

106887 HS334335 AA489091 ESTs 33 

123052 Hs.185766 AA481806 ESTs 33 

107072 Hs.130760 AA609113 Homo sapiens mRNA; cDNA DKFZp586N0318 (f 33 
102214 Hs.32964 U23752 SRY (sex-determining region Y)-box 11 32 

123147 AA487961 ab11h6.s1 Stratagene lung (#93721) Homo 32 

125435 Hs.272138 R00940 ye87g03.r1 Soares fetal liver spleen 1NF 32 

116246 Hs.250646 AA479961 ESTs; Highly similar to ubiquitin-conjug 32 

105169 Hs.180789 AA180321 Homo sapiens (clone S164) mRNA; 3' end o 32 

134001 Hs.78344 AF001548 myosin; heavy polypeptide 11; smooth mus 32 

124866 Hs.304389 R68571 ESTs 32 

133205 Hs.67619 AA089559 Homo sapiens mRNA; chromosome 1 specific 32 

102986 Hs.182378 X17648 colony stimulating factor 1 (macrophage) 32 

101232 Hs242894 L28997 ADP-ribosylation factor-like 1 3.1 

132906 Hs234896 AA142857 ESTs; Highly similar to geminin [H.sapie 3.1 

104281 Hs£669 C14290 ESTs 3.1 

123926 Hs227933 AA521348 ESTs; Highly similar to (defline not ava 3.1 

134464 HS239720 N79354 ESTs; Weakly similar to Rga (D.melanogas 3.1 

105322 Hs.16346 AA234100 ESTs 3.1 

100631 Hs.48332 HG27094fT28G5 Serine/Threonine Kinase (Gb:Z25431) 3.1 

130791 Hs.199263 AA259102 ESTs; Highly similar to (defline not ava 3.1 

131220 Hs.300855 R77200 ESTs 3.1 

113237 Hs.123642 T62857 ESTs 3.1 

125562 Hs*8968 AJ484372 ESTs 3.1 

134110 Hs.79136 U41060 Human breast cancer; estrogen regulated 3.1 

132393 Hs.47334 W85888 ESTs; Moderately similar to !!!! ALU SUB 3.1 

107439 Hs296842 W27995 ESTs; Moderately similar to non-muscle m 3.1 

125863 Hs.40719 AA299096 Homo sapiens mRNA; cDNA DKFZp564M0916 (f 3.1 

105811 HS286192 AA394121 ESTs 3.1 

129284 HS296141 AA104023 ESTs 3.1 

125321 Hs.178294 T86652 ESTs 3.1 

107332 Hs.183297 T87750 ESTs 3.1 

123570 Hs.109653 AA608955 ESTs 3.1 

100384 Hs.90800 D83646 matrix metalloproteinase 16 (membrane-in 3.1 

109063 Hs.38972 AA161043 tetraspan 1 3.1 

133284 Hs.182828 U09367 zinc finger protein 136 (clone pHZ-20) 3.1 

131839 Hs33010 H80622 Homo sapiens mRNA for KIAA0633 protein; 3.1 

117606 Hs.44698 N35115 ESTs 3.1 

418998 HS287B49 F13215 ESTs 3.1 

125180 Hs.103120 W58344 ESTs 3.1 

100769 HG3893-HT4163 Phosphoglucomutase 1, Alt. Splice 3.1 

126017 Hs.159440 H60487 ESTs 3.1 

132452 Hs247324 AA005262 Homo sapiens DNA sequence from PAC 282D1 3.1 

129077 Hs.108479 H78886 ESTs 3.1 

126563 Hs.181368 W26247 U5 snRNP-specific protein (220 kD); orth 3.1 

129650 Hs.116256 N52554 ESTs 3.1 

123465 AA599033 ESTs 3.1 

126486 Hs.152316 AA345339 EST51345 Gall bladder II Homo sapiens cD 3.1 

126460 Hs.167031 W01616 za36d05j1 Soares fetal liver spleen 1NF 3.1 

118697 Hs.43234 N72094 ESTs 3.1 

103860 Hs.38057 AA203742 ESTs 3.1 

127968 Hs.124347 AA971439 ESTs 3.1 

124984 HS223241 T47566 yb15c1U1 Stratagene placenta (#937225) 3.1 

103903 Hs.15220 AA249334 |312seq.F Human fetal heart, Lambda ZAP 3.1 

106697 HS22242 AA463737 ESTs 3.1 

130892 HS20993 AA442604 ESTs; Weakly similar to Ydr374cp [S.cere 3 

114032 Hs.35014 W92779 ESTs 3 

128835 Hs.106390 W15528 ESTs 3 

103867 HS247815 Z80788 H.sapiens H4/I gene 3 

126264 Hs250614 N42897 yy13h06.r1 Soares melanocyte 2NbHM Homo 3 

132626 HS21275 D25755 ESTs 3 

131107 Hs.75354 N87590 ESTs 3 

126760 Hs.5811 R12421 ESTs 3 

127363 HS22116 AA307744 Homo sapiens Cdc14B1 phosphatase mRNA; c 3 

103690 Hs.84063 AA016186 ESTs 3 

102589 Hs£867 U62015 Homo sapiens Cyr61 mRNA, complete cds 3 

125144 HS24336 W37999 ESTs 3 

132977 Hs.301404 U28686 RNA binding motif protein 3 3 

120714 Hs.146170 AA292689 ESTs 3 

101038 Hs.76411 J05249 replication protein A2 (32kD) 3 

102856 HS248177 X00090 Human histone H3 gene 3 

105516 Hs.30738 AA257871 ESTs 3 

131137 Hs.33287 U85193 nuclear factor I/B 3 
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127221 H&241551 AJ354332 ESTs 3 

411888 Hs£4104 R26703 ESTs 3 

131684 Hs3066 U26174 granzyme K (serine protease; granzyme 3; 3 

100629 H&21291 HG2706-HT28Q2 Serine/Threonine Kinase (Gb225428) 3 

5 119944 HJL58915 W86838 EST 3 

113801 Hs.118281 W38418 zinc finger protein 266 3 

133780 Hs.76152 M14219 decorin 3 

104690. Hs.14449 AA010889 ESTs 3 

126371 Hs.304139 N57645 EST 3 

10 127635 Hs.116346 AA766903 ESTs 3 

128434 Ks. 143880 M190914 ESTs 3 

435761 Hs.187555 AA701941 ESTs 3 

125025 Hsj50748 T71561 ESTs 3 

124940 Hs.1 03804 R99599 heterogeneous nuclear ribonudeoprotein 3 

15 128742 H&251531 D00763 prateasome (prasorne; macropain) subunit; 3 

107147 Hs.10450 AA621125 Homo sapiens chromosome 2; 10 repeat reg 3 

112068 HS52545 R43910 ESTs 3 

105346 Hs363727 AA235465 ESTs; Moderately similar to 111! ALU SUB 3 

130972 HS21739 AA370302 Homo sapiens mRNA; cDNA DKFZp586l1518 (f 3 

20 131230 H&274407 AA14S987 thymus specific serine peptidase 3 

133743 Hs.75847 N79435 ESTs 3 

127402 Hs£27949 AA358869 ESTs; Highly similar to SEC13-RELATED PR 3 

117483 Hs.44189 N30426 ESTs 3 

123659 Hs.112699 AA609368 ESTs 3 

103963 Hs.63290 AA298588 EST114219 HSC172 cells II Homo sapiens c 3 

103795 Hs.7367 AAt 12222 ESTs; Moderately similar to (defline not 3 

115092 Hs.80975 AA255903 CD39-like4 25 

134831 H&59890 S72370 pyruvate carboxylase 2.9 

128579 Hs.101810 AA093378 ESTs; Weakly similar to !!!! ALU SUBFAMI 2.9 

134193 Hs.7980 F09570 ESTs 25 

123522 Hs.112575 AA608577 ESTs 2.9 

107109 Hs.32793 AA609943 ESTs 2.9 

134694 H&68556 D50405 histone deacetylase 1 2.9 

134399 Hs.82689 H99801 tumor rejection antigen (gp96) 1 2.9 

134632 Hs.174139 AA398710 H. sapiens RNA for CLCN3 2.9 

106683 Hs.14512 AA461495 ESTs 2.9 

108555 AA084963 zn13e12.s1 Stratagene hNT neuron (#93723 2.9 

100953 Hs3110 HG945-HT945 Nucleic Acid-Binding Protein (Gb:L12693) 25 

130597 Hs.16492 AA173998 ESTs; Weakly similar to weakly similar t 2.9 

101813 Hs.139226 M57338 replication factor C (activator 1) 2 (40 2.9 

106636 HS286 AA459950 ESTs 2.9 

129109 Hs.108708 AA491295 calcium/calmodulin-dependent protein kin 19 

125819 H&251871 AA044640 stromal cell-derived factor 1 2.9 

106282 Hs.9857 AA433946 ESTs; Weakly similar to (defline not ava 19 

100386 Hs«301636 D83703 peroxisomal biogenesis factor 6 19 

114546 Hs58D74 AA056263 ESTs; Moderately similar to !!!! ALU SUB 25 

105914 H&9701 AA402224 Homo sapiens growth arrest and DNA-damag 19 

108552 AA084912 zn11c7.s1 Stratagene hNT neuron (#937233 19 

126505 Hs.190057 W26894 16a11 Human retina cDNA randomly primed 25 

134098 Hs.79086 X06323 Human MRL3 mRNA for ribosomal protein L3 25 

129721 HS511539 L19161 eukaryotic translation initiation factor 19 

100076 Hsl277422 AB000897 Homo sapiens mRNA for cadherin F1B3, par 25 

117466 Hs.44104 N29862 ESTs - 25 

106335 H&36688 AA437258 ESTs; Moderately similar to WAP four-dis 25 

134510 Hs.250870 U25265 protein kinase; mitogen-activated; kinas 19 

105835 HSJ32995 AA398412 ESTs 25 

106611 HS26267 AA458904 ESTs; Weakly similar to torsinA [H.sapie 19 

134087 Hs.173824 U51166 thymine-DNA glycosylase 2.9 

100641 Hs.182183 HG2743-HT2846 Caldesmon 1, Alt. Splice 4, Non-Muscle 19 

104602 R86920 ESTs 25 

117203 Hs.42738 H99799 ESTs 25 

131889 Hs.34073 AA401912 BH-protocadherin (brain-heart) 2.9 

101707 Hs.155212 M65131 methylmalonyl Coenzyme A mutase 25 

115271 HS5724 AA279422 ESTs 25 

125812 HS287912 H73420 lectin; mannose-binding; 1 25 

110740 Hs.19762 H99675 ESTs 25 

103406 Hs285728 X95677 H.sapiens mRNA for ArgBPIB protein 25 

104577 Hs.132390 R71539 ESTs 25 

102772 Hs.161002 U83115 absent in melanoma 1 25 
131710 Hs5Q985 AA233225 ESTs; Highly similar to (defline not ava 2.9 

125231 Hs.268903 W84714 ESTs 2.9 

127380 Hs.15535 AI417137 Homo sapiens clone 24582 mRNA sequence 2.9 

104229 Hs.61289 AB002346 inositol phosphate 5'-phosphatase 2 (syn 2.9 

126600 Hs.191385 AA699949 ESTs 2.9 

125175 HS.3Q303O W52355 EST 2.9 

103849 Hs.34576 AA187045 ESTs; Weakly similar to ALU SUBFAMI 2.9 

102126 Hs.78961 U14575 protein phosphatase 1; regulatory (inhib 2.9 

124906 Hs.107815 R87647 ESTs 2.9 

131148 Hs503125 C00038 ESTs 25 

123158 Hs.218329 AA48865B heat shock 70kD protein 1 2.9 

133667 Hs.75462 U72649 Human BTG2 (BTG2) mRNA; complete cds 25 

105182 Hs.18271 AA191014 ESTs; Weakly similar to Ydr372cp [S.cere 25 

133968 HS232068 D15050 Human mRNA for transcription factor AREB 2.9 

117425 Hs536901 N27154 ESTs 2.9 

111087 HS-37637 N59645 ESTs 25 

129641 Hs.11805 N66066 ESTs 25 

128639 Hs.102897 N91246 ESTs 25 

133209 Hs.79265 AA114183 ESTs; Moderately similar to glutamate py 2.9 

135154 Hs267812 AA126433 sorting nexin 4 25 

126838 HS279609 AA858097 pigment epithelium-derived factor 25 

103803 Hs.106149 AA127696 ESTs 25 

102139 Hs5128 U15932 dual specificity phosphatase 5 25 

128104 AA971000 op67g1U1 Soares_Na_T_GBC_S1 Homosapi 25 

127834 Hs.337631 AA761415 nz22d08.s1 NCI_CGAP_GCB1 Homo sapiens cD 25 

133101 Hs.180952 AA488230 ESTs 25 

127250 H&217916 AI023717 ESTs 25 

135063 Hs.93883 D10537 myelin protein zero (Charcot-Marie-Tooth 2.8 

126323 Hs.68644 N45014 yy80g06.r1 Soares_multiple_sclerosis_2Nb 25 

121873 Hs.145696 AA426270 ESTs 25 

122090 Hs.98884 AA432141 ESTs 25 

118728 HS.322645 N73705 ESTs 25 

135400 Hs.99915 M23263 androgen receptor (dihydrotestosterone r 25 

125278 Hs.129998 W93523 ESTs 25 

124387 Hs.109019 N27637 ESTs 25 

124603 Hs.12186 R45480 cyclin K 25 

H45968 Hs.32149 H45968 ESTs 25 

104261 Hs5405 AF008442 RNA polymerase I subunit 25 

105366 Hs.282093 AA236356 ESTs 2.8 

106070 Hs.5957 AA417761 Homo sapiens clone 24416 mRNA sequence 25 

131356 Hs55960 M13241 v-myc avian myelocytomatosis viral relat 25 

112009 Hs£6255 R42714 EST 25 

133199 H&250175 AA609773 Homo sapiens clone 23904 mRNA sequence 25 

110379 H&33130 H44825 ESTs 25 

103890 Hs.72085 AA236843 ESTs; Weakly similar to unknown [S.cerev 25 

128152 R20353 yg20f1O.rt Soares infant brain 1NIB Homo 25 

107008 Hs^3740 AA598710 ESTs 25 

135243 HS57101 AA215333 ESTs 25 

103058 Hs.184510 X57348 stratifin 25 

132020 H&293845 AA428990 ESTs 25 

116354 H&292566 AA504262 ESTs 2.8 

125867 Hs.12372 H98141 ESTs 25 

120603 Hs.98541 AA262787 ESTs; Highly similar to (defline not ava 25 

115119 Hs.46847 AA256524 Human DNA sequence from clone 30M3 on ch 25 

133865 Hs.170290 F09315 discs; large (Drosophila) homolog 5 25 

109415 Hs.110826 AA227219 Homo sapiens CAGF9 mRNA; partial cds 25 

128687 Hs.23767 Z38910 ESTs 25 

109984 Hs.10299 H09594 ESTs; Moderately similar to !!!! ALU SUB 25 

133179 Hs.66731 U81599 homeo box B13 25 

115998 Hs.336629 AA448488 ESTs; Weakly similar to zinc finger prot 25 

112180 Hs^5067 R49116 EST 25 

120428 Hs.173694 AA236822 ESTs; Moderately similar to (defline not 2.8 

106241 Hs.6019 AA430108 ESTs 25 

131060 HS52564 AA160890 myosin VI 2.8 

111383 Hs.40919 N94527 ESTs 2.8 

102123 Hs.1594 U14518 centromere protein A (17kD) 25 

102722 Hs.79981 U79242 Human clone 23560 mRNA sequence 25 

129887 Hs274324 W92041 PCAF associated factor 65 alpha 25 

126663 Hs.181297 AA714635 ESTs 25 
104387 Hs.134342 H17438 
107316 Hs.193700 T63174 
126059 Hs.145098 AA972446 
124447 N48000 
111398 Hs.125565 R00086 
134085 Hs.79018 U20979 
124788 Hs.100912 R43543 
112248 Hs326416 R51361 
121309 Hs.97312 AA402482 
103076 Hs.75319 X59818 
107071 Hs.35198 AA609053 
104425 Hs.35380 H88496 
132991 HS.62245 AA446906 
104968 HJL29669 AA084602 
121153 Hs.97694 AA399640 
131216 Hs.243901 D31058 
109682 H&22869 F09299 
131990 Hs.168818 H77734 
132027 Hs.181444 N78844 
127383 Hs.180478 AA447990 
132598 Hs£30 M81379 
101121 Hs.1313 L09753 
123000 Hs.105640 AA479347 
121329 Hs.1755 AM04324 
100481 Hs.121489 HG1098-HT1098 
113803 Hs.283683 W42789 
110934 Hs.169001 N48708 
432888 T86823 
121802 Hs.188898 AA424328 
130396 Hs.155313 AB002331 
121103 Hs.97697 AA398936 
131129 HS23240 R27296 
130943 Hs.272429 D50855 
134676 Hs.87819 W28051 
111900 Hs.25318 R39044 
106025 Hs.173334 AA412053 
126144 Hs.40639 N39696 
103248 Hs.75262 X77383 
127230 Hs.274170 H30501 
101584 HS.84072 M35252 
124131 Hs.167489 H19980 
129689 Hs.77873 AA130156 
132892 HsJ}973 W92797 
120827 Hs.132967 AA347717 
134579 Hs.85963 N23222 
106149 H&256301 AA424881 
132037 Hs.332541 AA203649 
130542 Hs.179825 U64675 
122851 Hs.99598 AA463627 
134983 Hs.196384 D28235 
120537 Hs.160422 AA262790 
131036 Hs.174140 X64330 
133889 Hs211582 AA099391 
128847 Hs.106529 AM24199 
112755 Hs306044 R93802 
423239 AA323591 
105031 Hs.12321 AA127240 
126021 Hs.187516 AA775894 
102116 U13706 
133394 Hs237225 R16759 
104267 Hs.278439 C00358 
107614 Hs.40241 AA004878 
129809 Hs.1259 X55283 
112109 Hs^83309 R45221 
T85681 

109494 Hs.43899 AA233702 
118696 H&292284 N72086 
106053 Hs.36727 AA416963 
104440 Hs.284380 L20492 
ESTs; Weakly similar to seven transmembra 
ESTs; Moderately similar to !!!! ALU SUB 
ESTs 
ESTs 

deafness; X-linked 1; progressive 

chromatin assembly factor I (150 kDa) 

ESTs 

ESTs 

ESTs 

ribonucleotide reductase M2 polypeptide 

ESTs 

ESTs 

solute carrier family 25 (mitochondrial 

ESTs 

ESTs 

ESTs 

ESTs 

ESTs; Moderately similar to roundabout 1 
ESTs; Weakly similar to R12C12.6 [C.eleg 
ESTs 

collagen; type IV; alpha 3 (Goodpasture 

tumor necrosis factor (ligand) superfami 

ESTs 

ESTs 

cystatin D 

ESTs 

ESTs; Weakly similar to cytochrome P450 

ESTs 

ESTs 

Human mRNA for KIAA0333 gene; partial cd 
ESTs; Weakly similar to (defline not ava 
ESTs 

calcium-sensing receptor (hypocalciuric 
ESTs; Weakly similar to keratin 9; cytos 
ESTs 
ESTs 

yx92a07j1 Soares melanocyte 2NbHM Homo 



Homo sapiens Opa-interacting protein OiP 

transmembrane 4 superfamily member 3 

ESTs 

ESTs 

ESTs 

ESTs 

ESTs; Moderately similar to !!!! ALU SUB 
ESTs 

ESTs; Weakly similar to HEM45 [H.sapiens 
Human sperm membrane protein BS-63 mRNA, 
ESTs 



ESTs 
ATP citrate lyase 
ESTs 

zv81e01.r1 Soares_total_fetus_Nb2HF8_8w 
ESTs 

EST26392 Cerebellum II Homo sapiens cDNA 

ESTs 

ESTs 

Human ELAV-like neuronal protein 1 isofo 
ESTs; Weakly similar to (defline not ava 
ESTs 

ESTs; Highly similar to (defline not ava 

asialoglycoprotein receptor 2 

ESTs; Weakly similar to !!!! ALU SUBFAMI 

yd60c06.r1 Soares fetal liver spleen 1NF 

ESTs 

Homo sapiens RNA polymerase III largest 
ESTs; Highly similar to histone H2A [H.s 
gamma-glutamyltransferase 1 
129426 Hs.111323 AM 12087 EST; Highly similar to (defline not ava 2.7 

123798 AA620411 small inducible cytokine A5 (RANTES) 2.7 

106716 HS238928 AA464962 ESTs 27 

103663 Z78291 Z78291 Homo sapiens brain fetus Homo sap 2.7 

114162 H&22265 Z38909 ESTs 2.7 

113063 Hs£G27 T32438 ESTs 2.7 

127897 AA773857 af80c09.r1 Soares_NhHMPu_S1 Homo sapiens 2.7 

130621 Hs.16803 AA621718 ESTs; Weakly similar to (defline not ava 2.7 

116245 Hs.42796 AA479958 ESTs; Highly similar to (defline not ava 2.7 

125499 R11878 yf49d11.r1 Soares infant brain 1NIB Homo 2.7 

133960 Hs.77899 M19267 tropomyosin 1 (alpha) 27 

104470 Hs.246358 N28843 ESTs; Weakly similar to Similar to coBa 2.7 

134982 Hs.92308 N46086 ESTs 27 

106803 Hs.284295 AM79114 ESTs 27 

104899 HS285574 AA054726 ESTs 2.7 

125401 Hs^37585 AI204637 ESTs; Moderately similar to KIAA0350 [H. 27 

111253 Hs.15768 N70042 ESTs; Moderately similar to !!!! ALU SUB 2.7 

118449 Hs.164478 N66413 ESTs; Weakly similar to (defline not ava 2.7 

134507 H&84318 M63488 replication protein A1 (70kD) 27 

121609 Hs.98185 AA416867 EST 2.7 

113835 H&27475 W56590 ESTs 27 

113962 H&285290 W86375 ESTs; Highly similar to (defline not ava 2.7 

121913 Hs.98558 AA428062 ESTs 2.7 

108194 H&216717 AA057250 ESTs 2.7 

130799 Hs.12696 AA464273 ESTs 2.7 

123184 Hs.18166 AA489072 Homo sapiens mRNA for KIAA0870 protein; 2.7 

103420 Hs.173497 X97065 SEC23-like protein B 2.7 

106186 Hs.6315 AA427398 acetylserotonin N-methyltransferase-like 2.7 

101349 L77559 Homo sapiens DGS-B partial mRNA 27 

112954 Hs.6655 T16559 ESTs 2.7 

133054 HS291079 R07876 ESTs; Weakly similar to unknown [S.cerev 2.7 

128131 H&25640 AI283162 claudin 3 2.6 

101864 Hs.75777 M95787 transgelin 2.6 

111948 Hs.26303 R40752 ESTs 2.6 

130145 Hs.151051 U07620 protein kinase mitogen-activated 10 (MAP 2.6 

126507 HS23964 AI362218 ESTs 2.6 

117903 HS47111 N50740 ESTs 2.6 

116345 Hs.199067 AA496981 ESTs 2.6 

132227 Hs.4248 AA412620 ESTs 2.6 

125746 HS274256 H03574 y}42b06x1 Soares placenta Nb2HP Homo sa 2.6 

105073 Hs.89463 AA137034 ESTs 2.6 

102764 U82310 Homo sapiens unknown protein mRNA, parti 2.6 

131367 Hs.173933 AA456687 ESTs 2.6 

130792 Hs.19500 AA307896 nuclear localization signal deleted in v 2.6 

107427 Hs.46736 W26975 ESTs 2.6 

117477 Hs.44175 N30328 ESTs 2.6 

106290 Hs.16364 AA435542 ESTs 2.6 

126829 Hs.7910 R11547 ESTs 2.6 

118836 Hs.173001 N79820 ESTs 2.6 

100147 Hs.136348 D13666 osteoblast specific factor 2 (fasciclin 2.6 

104278 Hs.109253 C02582 ESTs; Highly similar to (defline not ava 2.6 

135051 H&83484 C15324 ESTs 2.6 

126081 Hs.227835 AI346024 collagen; type I; alpha 1 2.6 

123579 AA608983 af564.s1 Soares_testis_NHT Homo sapiens 2.6 

130115 Hs.149923 M31627 X-box binding protein 1 2.6 

101434 Hs.1430 M20218 coagulation factor XI (plasma thrombopla 2.6 

122962 Hs.104720 AA478429 ESTs; Moderately similar to !!!! ALU SUB 2.6 

126151 Hs.40808 AA324743 ESTs 2.6 

128925 Hs.21851 D61676 Homo sapiens mRNA; cDNA DKFZp586J2118 (f 2.6 

128919 Hs.103391 L27559 insulin-like growth factor binding prote 2.6 

130296 Hs.154103 R09286 LIM protein (similar to rat protein kina 2.6 

128402 Hs.191637 AA457244 ESTs 2.6 

129273 Hs.109968 W63783 ESTs 2.6 

125483 Hs.7788 F07759 ESTs 2.6 

132953 Hs.321264 AA029927 ESTs 2.6 

130963 Hs21639 U57099 nuclear protein; marker for differential 2.6 

120614 Hs.194154 AA284281 ESTs; Weakly similar to !!!! ALU SUBFAMI 2.6 

123251 Hs.103267 AA490858 ESTs; Moderately similar to Rabln3 (Rjio 2.6 

121710 Hs.96744 AA419011 ESTs 2.6 
125428 Hs£51 W74608 ESTs; Highly similar to (defline not ava 2.6 

115906 Hs.82302 AA436616 ESTs 2.6 

108432 AA076626 Homo sapiens clone 23851 mRNA sequence 2.6 

126191 Hs.191911 H97728 ESTs 2.6 

106164 H&281434 AA425773 ESTs 2.6 

111519 H&268615 R08165 ESTs 2.6 

134590 Hs.173840 W58612 ESTs 2.6 

102565 U59748 Human desert hedgehog (hDHH) mRNA, parti 2.6 

129879 Hs.13109 AA194973 ESTs 2.6 

114264 Hs334609 Z40074 ESTs 2.6 

106236 HS21104 AA429951 ESTs 2.6 

135192 Hs.321709 AF000234 purinergic receptor P2X; ligand-gated io 2.6 

109833 Hs29B89 H00580 ESTs 2.6 

105756 Hs.8535 AA303088 ESTs; Weakly similar to transformation-r 2.6 

121422 Hs£7967 AA406210 ESTs 25 

130417 Hs.155485 U58522 Human huntingtin interacting protein (HI 25 

124312 Hs.102329 H94647 ESTs 2.6 

108998 Hs.97199 AA156058 ESTs 2.6 

127081 Hs.180591 R88362 ESTs; Weakly similar to weak similarity 25 

129574 Hs.11463 AA458603 ESTs; Weakly similar to (defline not ava 2.6 

112410 Hs£6904 R61680 ESTs 2.6 

123929 Hs.112981 AA621364 ESTs 2.6 

122905 Hs.104835 AA470070 ESTs 2.6 

116399 Hs.110637 AA599729 Homo sapiens homeobox protein A10(HOXA1 25 

130279 Hs.153934 AA424044 core-binding factor; runt domain; alpha 2.6 

130021 Hs.1435 M24470 guanosine monophosphate reductase 2.6 

100585 Hs.199160 HG2367-HT2463 Trithorax Homolog Hrx 2.6 

104965 Hs.30177 AA084104 ESTs 2.6 

117711 Hs.46485 N45201 EST 25 

124792 Hs.48712 R44357 ESTs 25 

111299 Hs.74313 N73808 ESTs 2.6 

103616 Hs.32971 Z46973 phosphoinositide-3-kinase; class 3 2.6 

133629 Hs.195614 D13642 KIAA0017 gene product 2.6 

126484 Hs.169977 AI086782 ESTs 2.6 

100856 HG4245-HT4515 Forkhead Family AM 2.6 

133547 Hs501927 X02883 T-cell receptor; alpha (V;D;J;C) 2.6 

126680 Hs.133855 F07097 ESTs 2.6 

125739 H&92137 AA428557 v-myc avian myelocytomatosis viral oncog 2.6 

102276 Hs.10247 U30999 Human (memc) mRNA, 3UTR 2.6 

105586 Hs.191538 AA279137 ESTs 2.6 

103978 H&34136 AA307443 ESTs 2.6 

125054 Hs.268601 T80622 ESTs; Weakly similar to (defline not ava 2.6 

114212 H&21201 Z39338 ESTs; Highly similar to (defline not ava 25 

116959 H&40Q22 H79310 EST 16 

109228 Hs.306995 AA193366 ESTs 2.6 

133989 Hs.78202 U29175 SWI/SNF related; matrix associated; acti 2.6 

100640 Hs.182183 HG2743-HT2845 Caldesmon 1, Alt. Splice 3, Non-Muscle 2.6 

133093 HS285996 AA598749 ESTs 25 

114306 Hs.6540 Z40861 ESTs 2.6 

106060 Hs.171391 AA417287 C-terminal binding protein 2 2.5 

107748 Hs.60772 AA017258 EST 25 

100134 Hs.49 D13284 macrophage scavenger receptor 1 15 

133969 Hs.78 U13044 GA-binding protein transcription factor; 25 

130992 Hs.74316 AA455001 ESTs 25 

127493 HS291701 AA808081 oc39a08.s1 NCI_CGAP_GCB1 Homo sapiens cD 25 

132869 H&2039S1 N26855 ESTs 25 

117570 H&44583 N34415 EST 25 

124644 Hs.109654 N91279 ESTs 25 

103558 Hs2785 Z19574 keratin 17 25 

132883 Hs5897 AA047151 ESTs 25 

102009 Hs.82643 U02680 protein tyrosine kinase 9 25 

116058 HS50159 AA454156 ESTs 25 

121989 Hs.193784 AA430044 ESTs 15 

131257 Hs.24908 AA256042 ESTs 25 

100320 Hs.75275 D50916 homolog of yeast (S. cerevisiae) ufd2 25 

102959 Hs.121524 X15722 glutathione reductase 25 

132969 Hs.6166 AA047616 ESTs 25 

130869 Hs.2057 AA128100 uridine monophosphate synthetase (orotat 25 

129645 Hs.116131 L38928 5;10-methenyltetrahydrofolate synthetase 25 
126399 Hs£38B3 AA128075 2l16d08j1 Soares_pregnant_uterus_NbHPU 2.5 

134069 Hs.78935 U29607 Homo sapiens elF-2-associated p67 homolo 2.5 

109816 Hs.61960 P11013 ESTs; Weakly similar to KIAA0176 [H.sapi 25 

134601 H&89695 X02160 insulin receptor 25 

104232 Hs.10587 AB002351 Human mRNA for KIAA0353 gene; partial cd 2.5 

107361 Hs.159486 U72513 Human RPL13-2 pseudogene mRNA; complete 2.5 

105057 H&289074 AA417067 ESTs 25 

134252 Hs.80720 AA031762 Homo sapiens mRNA; cDNA DKFZp586B1722 (f 25 

128062 Hs.105547 AA379500 ESTs 25 

110009 Hs.6614 H10933 ESTs 25 

111375 Hs*0432 N93696 ESTs 2.5 

122642 H&99361 AA454186 ESTs 25 

127999 Hs.69851 AA83749S ESTs; Weakly similar to Wiskott-Aldrich 25 

105029 Hs.13268 AA126855 ESTs 25 

105082 H&26765 AA143763 ESTs; Weakly similar to Similarity to S. 25 
TABLE 1A shows the accession numbers for those primekeys lacking UnigeneIDs for Table 
1. For each probeset we have listed the gene cluster number from which the oligonucleotides 
were designed. Gene clusters were compiled using sequences derived from Genbank ESTs 
and mRNAs. These sequences were clustered based on sequence similarity using Clustering 
and Alignment Tools (DoubleTwist, Oakland, California). The Genbank accession numbers 
for sequences comprising each cluster are listed in the "Accession" column. 
Pkey: Unique Eos probeset identifier number 

CAT number: Gene cluster number 

Accession: Genbank accession numbers 
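
The three-column layout described above (Pkey, CAT number, Accession) can be read into a lookup keyed by probeset. The following is a minimal sketch only, assuming whitespace-delimited rows with one Pkey, one CAT number, and one or more accession numbers per row; the names ClusterRow and parse_cluster_rows are hypothetical and are not part of the disclosure. The example rows are taken from the cluster tables in this document.

```python
from typing import Dict, Iterable, List, NamedTuple


class ClusterRow(NamedTuple):
    """One row of a Pkey / CAT number / Accession table (hypothetical type)."""
    pkey: str              # Eos probeset identifier
    cat_number: str        # gene cluster number
    accessions: List[str]  # Genbank accession numbers in the cluster


def parse_cluster_rows(lines: Iterable[str]) -> Dict[str, ClusterRow]:
    """Index whitespace-delimited rows by Pkey, skipping incomplete rows."""
    table: Dict[str, ClusterRow] = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            # A row needs a Pkey, a CAT number and at least one accession.
            continue
        pkey, cat_number, *accessions = fields
        table[pkey] = ClusterRow(pkey, cat_number, accessions)
    return table


# Example rows as they appear in the cluster tables of this document.
rows = [
    "127248 227560.1 AA364195 AA325029 AW962050",
    "102398 entrez_U42359 U42359",
    "113938 genbank_W81598 W81598",
]
clusters = parse_cluster_rows(rows)
print(clusters["127248"].accessions)
```

A lookup of this kind only helps when each row is intact; rows whose columns were separated during extraction would first have to be re-paired by hand.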






Pkey CAT number 



126086 
102565 
101964 
125499 
125598 
118417 



11 1555 J 

1596090J 

1608216J 

32479J 

48158_-7 

1562851.1 

1708455.1 

37186.1 



125661 
125957 
125982 
127248 
103731 
127261 
127265 
126659 
127315 
103806 
128104 
104602 
128152 
128422 
127897 
106566 



327827.1 

1583542.1 

1766315.1 

227560.1 

112052.1 

231687.1 

232391.1 

1541209J 

37938 1 

112618 J 

502608.1 

524482.2 

297868.1 

1811283.1 

446527.1 

120358.1 



50 129735 44573^ 






123147 
130529 
123579 
109175 
100789 
100858 



AA071210 AA069899 AA071438 AA084912 AA084803 AA079371 AA079370 
H57661 H58881 
H75681 H70975 
AB010994 U59748 AA064660 
S81578 

H10543 R11878 
R25698 R56582 R56018 

AF080229 AF080231 AF080230 AF080232 AF080233 AF080234 BE550633 AI636743 AW614951 BE467547 AI680833 
AI633818 N29986 U87592 U87593 U87590 U87591 S46404 U87587 AA463992 AW206802 AI970376 AI583718 AI672574 
N25695 AW665466 AI818326 AA126128 AI480345 AW013827 AA248638 AI214968 AA204735 AA207155 AA206262 
AA204833 AW003247 AW496808 AI080480 AI631703 AI651023 AI867418 AW818140 AA502500 AI206189 AI671282 
AI352545 BE501030 AI652535 BE465762 AA206331 AW451866 AA471088 AA206342 AA204834 AA206100 AW021661 
AA332922 N66048 AA703396 H92278 AW139734 H92683 U87589 U87595 H69001 U87594 BE468420 AI624817 
BE466611 AI206344 AA574397 AA346354 AW93192 

AA491830 R50173 R55192 R50320 AI732306 AI732305 AI820727 AI820728 R55191 R50319 R50227 
H41694 H45213 
R98091 W9289B 

AA364195 AA325029 AW962050 
AA070545 AA131490 AA131373 
AA330501 AA661567 
AA331503 AA332751 AW962542 
T16245 R19694 F13545 H10299 T66048 T65279 H18006 
AF116622 AI114507 AA640834 AA377999 
AA130614 AA071410 
AA906093 AA971000 
H47610 R86920 
F07973 R20353 AA442660 
T77794 T85681 
AA773681 AA773857 

BE288210 AI672315 AW086489 BE298417 AA455921 AA902537 BE327124 R14963 AA085210 AW274273 AI333584 
AI369742 AI039658 AI885095 AI476470 AI287650 AI885299 AI985381 AW592S24 AW340136 AI266556 AA456390 
AI310815 AA484951 

AI950087 N70208 R97040 N36809 AI308119 AW967677 N35320 AI251473 H59397 AW971573 R97278 W01059 
AW967671 AA908598 AA251875 AI820501 AI820532 W87891 T85904 1)71456 T82391 BE328571 T75102 R34725 
AA884922 BE328517 M219788 AA884444 N92578 F13493 AA927794 AI560251 AW874068 AL134043 AW235363 
AA663345 AW008282 AA488964 AA283144 AI890387 AI950344 AI741346 AI689062 AA282915 AW102898 AI872193 
AI763273 AW173586 AW150329 AI653832 AI762688 AA988777 AA488892 AI358394 AW103813 AI539642 AA642789 
AA856975 AW505512 AI961530 AW629970 BE612881 AW276997 AW513601 AW512843 AA044209 AW856538 
AA180009 AA337499 AW961101 AA251669 AA251874 AI819225 AW205862 AI683338 AI858509 AW276905 AI633006 
AA972584 AA908741 AW072629 AW513996 AA293273 AA969759 N75628 N22388 H84729 H60052 T92487 AI022058 
AA780419 AA551005 W80701 AW613456 AI373032 AI564269 F00531 H83488 W37181 W78802 R66056 AI002839 
R67840 AA300207 AW959581 T63226 F04005 
219802.-2 AA487961 
158447.1 AA178953 AA192740 

genbank_AA608983 AA608983 
genbank_AA180496 AA180496 
tigr_HT4163 S67998 
tigr_HT4515 U10072 



123798 
102116 
102398 
102764 
118475 
104776 
104787 
113702 
113938 
122635 
108407 
108432 
108555 
101349 
124447 
118071 
103520 
103663 
128046 



579959.1 

entrez_U13706 

entrez_U42359 

entrez_U82310 

genbank_N66845 

genbank_AA026349 

genbank_AA027317 

genbank_T97307 

genbank_W81598 

genbank_AA454085 

genbank_AA075519 

genbank_AA076626 

genbank_AA084963 

entrez_L77559 

genbank_N48000 

genbank_R31180 

entrez_Y10511 

genbank_Z78291 

877605J 

546044J 

genbank_AA599033 



AA620411 AA287491 

U13706 

U42359 

U82310 

N66845 

AA026349 

AA027317 

T97307 

W81598 

AA454085 

AA075519 

AA076626 

AA084963 

L77559 

N48000 

R31160 

Y10511 

Z78291 

AA873285 AI025762 
AA199853 AA206355 
AA599033 
TABLE 2 shows a preferred subset of the Accession numbers for genes found in Table 1 
which are differentially expressed in prostate tumor tissue compared to normal prostate 
tissue. 

Pkey: Unique Eos probeset identifier number 

ExAccn: Exemplar Accession number, Genbank accession number 

UnigeneID: Unigene number 

Unigene Title: Unigene gene title 

R1: Ratio of tumor to normal body tissue (Relaxed ratio (87/70)) 



ExAccn UnigeneID Unigene Title 



H&272458 

Hs5909Q5 

HS.1852 

H&279477 

Hs.183752 

Hs.171995 

H&57771 

Hs.162859 

Hs5Q343 

Hs.1832 

Hs.1915 

HSL262476 



131919 AA121266 
120328 AA196979 
101486 M24802 
119073 R32894 
133426 M34376 
128180 AA595348 
104080 AA402971 
127537 AA569531 
131665 R22139 
101050 K01911 
130771 N48056 
107485 W63793 
106155 AA425309 
129534 R73640 
100569 HG2261-HT2351 
101889 S39329 Hs.181350 
135389 U05237 
133944 AA045870 
130974 X57985 
114768 AA149007 
104660 AA007160 
131061 N64328 
126645 AM 67942 
135153 N40141 
107033 AA599629 
118417 N66048 
126758 W37145 
107102 AA609723 
116787 H28581 
115719 AA416997 
123209 AA489711 
101664 M60752 
112971 T17185 
117984 N51919 
129523 M30894 
132964 AA031360 
121853 AA425887 
119617 W47380 
105627 AA281245 
101461 M22430 

133845 T68510 
133354 AA055552 
119018 N95796 
100394 D84276 
106579 AA456135 
114965 AA250737 
112033 R43162 
102398 U42359 
101201 L22524 
101803 M86546 
120562 AA280036 



Hs.11260 



Hs.99872 

Hs.7780 

Hi2178 

Hs.182339 

Hs.14846 

Hs568744 

Hs.61635 

Hs55420 

Hs.113314 



ESTs 

ESTs; Weakly similar to (defline not ava 

acid phosphatase; prostate 

ESTs 

microseminoprotein; beta- 

kallikrein 3; (prostate specific antigen 

Homo sapiens mRNA for serine protease (T 

ESTs 

ESTs 

neuropeptide Y 

folate hydrolase (prostate-specific memb 
S-adenosylmethionine decarboxylase 1 
ESTs 
ESTs 

kallikrein 2; prostatic 
fetal Alzheimer antigen 
ESTs 

H2B histone family; member Q 
ESTs 



Hs.293960 
Hs.30652 
Hs.15641 
Hs59622 
HS5Q3270 
Hs.121017 
Hs.83883 
Hs.106778 
HS274509 
Hs.167133 
Hs58502 
Hs55999 
Hs53317 
Hs.76422 
H&293185 
Hs.76704 
Hs534762 
Hs.278695 
Hs56052 
Hs.23023 
Hs.72472 
Hs.22627 

H&2256 

Hs.155691 

Hs.302267 



ESTs; Moderately similar to KIAA0273 [H. 
Homo sapiens BAC clone RQ041D11 from 7q2 10.7 

Homo sapiens mRNA for JM27 protein; comp 1 05 

ESTs 105 

ESTs; Weakly similar to polymerase [H.sa 1 05 

ESTs 102 

ESTs 10.1 

ESTs 10.1 

ESTs 10 

ESTs 9.9 

H2A histone family; member A 9.8 

ESTs 9J 

ESTs 9.7 

T-cell receptor; gamma cluster 94 

ESTs 95 

ESTs 9 

ESTs 8.9 

ESTs 8.8 

phospholipase A2; group IIA (platelets; 8.7 

yz61c5.s1 Soares_multiple_sclerosis_2NbH 85 

ESTs 8.2 

ESTs; Weakly similar to KIAA0319 [H.sapi 8.1 

ESTs 8 

CD38 antigen (p45) 8 

ESTs 7.6 

ESTs 7.4 

ESTs 7.1 

Human N33 protein form 1(N33) gene, exo 7 

matrix metalloproteinase 7 (matrilysin 65 

pre-B-cell leukemia transcription factor 65 

ESTs; Weakly similar to W01A6.C [C.elega 65 
R1 

37.2 
32.6 
255 

235 

21.4 
18.9 

m 

MA 

17-3 

17 

16.7 

165 

164 

Antigen, Prostate Specific, Alt. Splice 1 6 
154 
15 
125 
115 
115 
1U 
105 
109112 AA169379 Hs257924 ESTs 63 

109795 F10707 Hs.326416 ESTs 67 

130336 X07730 Hs.171995 kallikrein 3; (prostate specific antigen 6.6 

131425 AA219134 HS36691 ESTs 6.6 

132902 M490969 Hs39838 ESTs 63 

133724 U07919 Hs.75746 aldehyde dehydrogenase 6 63 

120215 Z41050 Hs.108787 Homo sapiens Mod4p homolog mRNA; complet 63 

131681 AA010163 Hs.3383 upstream regulatory element binding prot 63 

100727 X07290 H&334786 Human HF.12 gene mRNA 63 

121770 AA421714 HSJ278428 Homo sapiens mRNA for KIAA0896 protein; 63 

123475 AA599267 H&250526 ESTs; Weakly similar to ANKYRIN; BRAIN V 63 

133061 AB000584 Hs396638 prostate differentiation factor 63 

116429 AA609710 Hs.279923 ESTs; Weakly similar to similar to GTP-b 6.2 

101233 L29008 Hs378 sorbitol dehydrogenase 63 

104691 AA011176 Hs37744 ESTs 63 
127248 AA325029 EST27953 Cerebellum II Homo sapiens cDNA 6.2 

105500 AA256485 H&322399 ESTs 6.1 

130828 AA053400 H&203213 ESTs 53 

115357 AA281793 Hs.72988 ESTs 53 

116334 AA491457 Hs.48948 ESTs 5.7 

120132 Z38829 Hs.125019 ESTs; Weakly similar to !!!! ALU SUBFAMI 53 

106375 AA443993 Hs389072 ESTs 53 

124777 R41933 Hs.140237 ESTs; Weakly similar to neuronal thread 53 

101791 M83822 Hs.62354 Human beige-like protein (BGL) mRNA; par 53 

117698 N41002 Hs.45107 ESTs 53 

122041 AA431407 Hs38732 Homo sapiens Chromosome 16 BAC clone CIT 53 

133723 AA088851 H&262476 S-adenosylmethionine decarboxylase 1 53 

113938 W81598 ESTs 5.4 

133015 AA047036 Hs346315 ESTs 5.4 

108186 AA056482 Hs.7780 ESTs 53 

104466 N25110 Hs326392 Human guanine nucleotide exchange factor 53 

104033 AA365G31 Hs38944 ESTs 53 

110844 N31952 Hs.167531 ESTs; Weakly similar to (defline not ava 53 

129056 H70627 Hs.108336 ESTs; Weakly similar to !!!! ALU SUBFAMI 53 

133493 AA284143 Hs.194369 Homo sapiens chromosome 1 atrophin-1 rel 53 

129184 W26769 Hs.109201 ESTs; Highly similar to (defline not ava 53 

101448 M21389 Hs.195850 keratin 5 (epidermolysis bullosa simplex 5.1 

116188 AA464728 Hs.184598 ESTs; Weakly similar to !!!! ALU SUBFAMI 5.1 

105921 AA402613 Hs.169119 ESTs 5.1 

103375 X91868 Hs34416 sine oculis homeobox (Drosophila) homolo 5.1 

128871 AA400271 Hs.106778 ESTs; Highly similar to (defline not ava 5.1 

116238 AA479362 Hs^7144 ESTs 5 

102913 X07696 Hs30342 keratin 15 5 

103011 X52541 Hs326035 early growth response 1 5 

118981 N93839 Hs39288 ESTs; Weakly similar to !!!! ALU SUBFAMI 5 
TABLE 2A shows the accession numbers for those primekeys lacking UnigeneIDs for Table 
2. For each probeset we have listed the gene cluster number from which the oligonucleotides 
were designed. Gene clusters were compiled using sequences derived from Genbank ESTs 
and mRNAs. These sequences were clustered based on sequence similarity using Clustering 
and Alignment Tools (DoubleTwist, Oakland, California). The Genbank accession numbers 
for sequences comprising each cluster are listed in the "Accession" column. 






Pkey: Unique Eos probeset identifier number 

CAT number: Gene cluster number 

Accession: Genbank accession numbers 



Pkey CAT number Accession 



118417 37186 1 AF080229 AF080231 AF080230 AF080232 AF080233 AF080234 BE550633 AI636743 AW614851 BE407547 AI680833 

AI633818 N29986 U87592 U87593 U87590 U87591 S46404 U87587 AA463992 AW206802 AI970376 AI583718 AI672574 
N25695 AW565466 AI818326 AA126128 AI480345 AW013827 AA248638 AI214968 AA204735 AA207155 AA206262 
AA204833 AW003247 AW496808 AI080480 AI631703 AI651023 AI887418 AW818140 AA502500 AI206199 AI671282 
AI352545 BE501030 AI652535 BE465762 AA206331 AW451866 AA471088 AA206342 AA204834 AA206100 AW021661 
AA332922 N66048 AA703396 H92278 AW139734 H92683 U87589 U87595 H69001 U87594 BE466420 AI624817 
BE466611 AI206344 AA574397 AA348354 AI493192 

127248 227560 1 AA364195 AA325029 AW962050 

107033 235652~1 AI141999 M730176 R44544 R41778 AW300793 AW966157 AA918501 AA599629 AI082195 AI198537 AW006520 

AW236663 AW151420 AI826S87 AI810832 AI669102 AI201981 N27331 AA335566 T84622 BE085347 BE085269 
102398 entrez_U42359 U42359 
113938 genbank_W81598 W81598 
TABLE 3 shows genes, including expressed sequence tags, differentially expressed in 
prostate tumor tissue compared to normal tissue as analyzed using the Affymetrix/Eos Hu02 
GeneChip array. Shown are the relative amounts of each gene expressed in prostate tumor 
samples and various normal tissue samples showing the highest expression of the gene. 






Pkey: Unique Eos probeset identifier number 

ExAccn: Exemplar Accession number, Genbank 

UnigeneID: Unigene number 

Unigene Title: Unigene gene title 

R1: Ratio of tumor to normal body tissue 
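
The R1 column is defined above only as a ratio of tumor to normal body tissue expression. As a minimal sketch of how such a ratio could be computed, the following assumes that each tissue class is summarized by its median intensity and that a small floor guards against division by very low values. Both choices, the function name tumor_to_normal_ratio, and the intensity values are hypothetical illustrations; the document does not state how the representative values behind R1 were chosen.

```python
from statistics import median
from typing import Sequence


def tumor_to_normal_ratio(
    tumor: Sequence[float],
    normal: Sequence[float],
    floor: float = 1.0,
) -> float:
    """Ratio of a representative tumor intensity to a representative normal one.

    Assumption: the median summarizes each class, and `floor` clamps both
    values from below so a near-zero normal value cannot inflate the ratio.
    """
    tumor_level = max(median(tumor), floor)
    normal_level = max(median(normal), floor)
    return tumor_level / normal_level


# Made-up intensities for a single probeset, for illustration only.
r1 = tumor_to_normal_ratio([120.0, 150.0, 180.0], [20.0, 30.0, 40.0])
print(r1)
```

A table such as Table 3 would then list only probesets whose ratio exceeds a chosen cutoff; the cutoff itself is likewise not stated here.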



Pkey ExAccn UnigeneID Unigene Title 






100131 
100235 
100570 
100819 
101063 
101247 
101416 
101447 
101485 
101514 
101626 



D12465 
D29954 



Hs.11951 
Ha.13421 



HG2261-HT2352 
HG4020-HT4290 
L00354 Hs.80247 



L33801 
M17254 
M21305 
M24736 
M28214 
M57399 



101758 
101768 
101817 
101888 
102031 
102052 
102221 



M77836 
M81118 
M88163 
M99701 



U07559 
U24576 
U26173 



102457 
102473 



102751 
102823 



103031 
103043 
103093 
103376 
103401 
103613 
103677 
103962 
104084 
104257 
104301 
104769 
104851 
104896 
104956 
104957 
104967 



U37519 
U48807 
U49957 
U71207 
U75272 
U80034 
U90914 
X02544 
X54667 
X55733 
X60708 



X9524C 

Z46629 

Z63806 

AA298180 

AA410529 

AF006265 

D45332 

AA025887 

AA040882 

AA054228 

AA074880 

AA074919 

AA084506 

AA150776 



Hs.78802 
HS279477 

Hs.89546 

Hs.123072 

Hs.44 

Hs2178 

Hs.79217 



Hs.152292 

Hs.95243 

Hs2156 

H$£Q5 

Hs.3844 

Hs.79334 

Hs.69171 

Hs.87539 

Hs2359 

Hs.180398 

Hs29279 

Hs.1867 

Hs.68583 

Hs.5057 

H$572 

Hs.123114 

Hs-93379 

Hs.44926. 

H&323379 

Hs54431 

H&2316 

Hs33243 

Hs30732 

Hs.9222 

Hs.6783 

Hs293943 

Hs.10290 

Hs23165 

Hs20509 

Hs.10026 

HS291000 

Hs23729 



phosphodiesterase I/nucleotide pyrophosp 

KIAA0056 protein 

Hs.171995 

Hs2387 

cholecystokinin 

glycogen synthase kinase 3 beta 
v-ets avian erythroblastosis virus E26 o 
Human alpha satellite and satellite 3 ju 
selectin E (endothelial adhesion molecul 
RAB3B; member RAS oncogene family 
pleiotrophin (heparin binding growth fac 
H2B histone family; member A 
pyrroline-5-carboxylate reductase 1 

SWI/SNF related; matrix associated; ac 



105298 AA233459 H&26369 



RAR-related orphan receptor A 

ISL1 transcription factor; LIM/homeodoma 

LIM domain only 4 

nuclear factor; interleukin 3 regulated 

protein kinase C-like 2 

aldehyde dehydrogenase 8 

dual specificity phosphatase 4 

LIM domain-containing preferred transloc 

eyes absent (Drosophila) homolog 2 

progastricsin (pepsinogen C) 

mitochondrial intermediate peptidase 

carboxypeptidase D 

orosomucoid 1 

cystatin S 

eukaryotic translation initiation factor 
dipeptidylpeptidase IV (CD26; adenosine 
coated vesicle membrane protein 
specific granule protein (28 kDa); cyste 
SRY (sex-determining region Y)-box 9 (ca 
H.sapiens mRNA for axonemal dynein heavy 
ESTs 
ESTs 

estrogen receptor-binding fragment-assoc 
ESTs 

ESTs; Weakly similar to 111! ALU SUBFAMI 
U5 snRNP-specific40 kDa protein (hPrp8- 
ESTs 

ESTs; Weakly similar to hypothetical pro 
ESTs; Weakly similar to ORF YJL063C [S.c 
ESTs 

Homo sapiens done 24405 mRNA sequence 
ESTs 




R1 

63 
5.1 

Antigen, Prostate Specific Alt Splice

Transglutaminase 105

85 

4.7 

4.7 

11 

9.8 

62 

8.4 

43 

5.4 

75 

5-5 

5.7 

132 

8.9 

53 

1A 

82 

5.9 

5.1 

5.7 

9 

10.6 

15j6 

43 

22.6 

4.7 

43 

5.8 " 

52 

7.4 

52 

4.9 

6 

64 

63 

105 

63 

4.9 

53 

6.4 

43 

65 

7 

5.1 
105304 AA233553 Hs.190325 ESTs 4.7

105370 AA236476 Hs22791 ESTs; Weakly similar to transmembrane pr 103 

105427 AA251330 Hs.28248 ESTs 5 

105542 AA261858 Hs.266957 ESTs; Weakly similar to heat shock prote 83

105628 AA281251 Hs.79828 ESTs; Weakly similar to putative zinc fi 5.6

105640 AA281623 Hs.6685 ESTs; Weakly similar to KIAA0742 protein 8

105645 AA282138 Hs.11325 ESTs 14

105691 AA287097 Hs289068 transcription factor 4 63

105730 AA292701 HS5364 DKFZP5641052 protein 43

105808 AA393808 Hs386131 KIAA0438 gene product 7

105826 AA398243 Hs.194477 ESTs; Moderately similar to similar to N 5

105903 AA401433 H&200016 ESTs; Weakly similar to diphosphoinosito 93

105906 AA401633 HS32380 ESTs 115 

106065 AA417558 HSJ25206 ESTs 5.1 

106094 AA419481 Hs33317 ESTs 103

106157 AA425367 HS34892 ESTs 6.6 

106184 AA426643 Hs.10762 ESTs 83 

106211 AA428240 Hs.126083 ESTs 84 

106213 AA428258 Hs3769 Homo sapiens mRNA; cDNA DKFZp564E153 (fr 5.7

106272 AA432074 H&323099 ESTs 53

106369 AA44382B H&288856 ESTs 63

106400 AA447621 Hs.94109 ESTs BA

106474 AA450212 Hs.42484 Homo sapiens mRNA; cDNA DKFZp564C053 (fr 92

106507 AA452584 H&267819 protein phosphatase 1; regulatory (inhib 53

106523 AA453441 Hs31511 ESTs 4.7

106532 AA453628 Hs37443 ESTs 4.7 

106557 AA455087 H&22247 ESTs 5.7 

106575 AA456039 Hs.105421 ESTs 72 

106618 AA459249 Hs3715 ESTs; Weakly similar to Similarity with 53 

106820 AA481037 Hs.12592 ESTs 5.4

106846 AA485223 Hs.34892 ESTs 53

106973 AA505141 Hs.11923 Human DNA sequence from clone 167A19 on 73

107110 AA609952 Hs.12784 KIAA0293 protein 6.1

107127 AA620504 Hs.179898 ESTs 7.1

107159 AA621340 Hs.10600 ESTs; Weakly similar to ORF YKR081c [S.c 5.2

107217 D51095 Hs35861 DKFZP586E1621 protein 15.1

107365 U78294 Hs.111256 arachidonate 15-lipoxygenase; second typ 4.7

107630 AA007218 H&60178 ESTs 53 

107734 AA016225 Hs.7517 ESTs 43 

107760 AA018042 H&252085 EST 73

107997 AA037388 Hs32223 Human DNA sequence from clone 141H5 on c 105

108012 AA039616 Hs.173334 ESTs 63

108520 AA084138 Hs.46786 ESTs 73 

108583 AA088276 Hs3B826 ESTs 53 

108613 AA100967 Hs.69165 ESTs 6

108664 AA113349 Hs.69588 EST 63

108677 AA115629 Hs.118531 ESTs 53

108807 AA129968 Hs.49376 ESTs; Weakly similar to PROTEIN PHOSPHAT 53

108910 AA136590 ESTs 5

108933 AA147224 Hs337232 ESTs 12.7

108948 AA149579 Hs.118258 ESTs 63 

109014 AA156790 Hs362036 ESTs 153 

109124 AA171529 Hs.183887 ESTs 6.1

109142 AA176438 Hs.41295 ESTs 5.1

109277 AA196332 Hs.86043 ESTs 55
109342 AA213620 Homo sapiens mRNA; cDNA DKFZp586M1418 (f6

109562 F01811 Hs.187931 ESTs; Moderately similar to voltage-gate 103

109565 F01930 Hs33648 ESTs 7

109648 FO460O Hs.7154 ESTs 9.9

109799 F10770 Hs.180378 Homo sapiens clone 669 unknown mRNA; com 6.4

109859 H02308 H&20792 ESTs 53

110181 H20276 Hs31742 ESTs 163

110654 N32919 Hs.27931 ESTs 10

110924 N47938 Hs.12940 yy84a09.s1 Soares_multiple_sclerosis_2Nb 53

111046 N55514 Hs318584 ESTs 63

111091 N59858 Hs33032 Homo sapiens mRNA; cDNA DKFZp434N185 (fr 52

111157 N66613 Hs39364 ESTs 5

111164 N66857 Hs.122489 ESTs; Weakly similar to !!!! ALU CLASS C 55

111221 N68869 Hs.15119 ESTs 63 




WO 02/30268 



PCT/US01/32045 



111348 N90041 H&3585 ESTs 5.4 

111353 N90430 Hs.6616 ESTs 53 

111495 R07210 HS3683 ESTs 5.8 

111540 R08850 Hs3786 ESTs 6 

111579 R10657 Hs.167115 KIAA0830 protein 123

111581 R10684 Hs3794 ESTs 7.1

111734 R25375 Hs.128749 ESTs 63

111861 R37460 Hs35231 ESTs U

111870 R37778 Hs.18685 ESTs; Weakly similar to hypothetical pro 63

111937 R40431 Hs.14846 Homo sapiens mRNA; cDNA DKFZp564D016 (fr 43

111987 R42036 Hs.6763 KIAA0942 protein 6.4 

112184 R49173 H&330242 ESTs 5.6 

112286 R53765 Hs.158135 KIAA0981 protein 93 

112380 R59740 Hs3740 ESTs 4.7 

112452 R63841 Hs.157461 ESTs 6

112601 R79111 Hs.78225 annexin A1 BA

112753 R93696 Hs.169882 ESTs 53

112902 T09262 Hs.129190 ESTs 5.1

112984 123457 Hs389014 ESTs 43

113021 T23855 Hs.129836 KIAA1028 protein 103

113083 T40530 Hs366957 ESTs; Weakly similar to heat shock prote 5.7

113200 T57773 Hs.10263 ESTs 73

113494 T88878 Hs36538 ESTs 8.7

113849 W60439 Ks385B ESTs; Moderately similar to cbp146 [M.mu 43

113883 W72382 Hs.11958 oxidative 3 alpha hydroxysteroid dehydro 4.7

113950 W85765 Hs30504 Homo sapiens mRNA; cDNA DKFZp434E082 (fr 6.7 

113986 W87462 H&31894 ESTs 53 

113989 W87544 Hs36BB28 ESTs 4.7 

114124 Z38595 Hs.125019 ESTs; Highly similar to KIAA0886 protein 213

114340 Z41395 Hs.143611 ESTs 9.6

114346 Z41450 Hs.130469 ESTs 53

114435 AA018216 Hs.164975 Bicaudal D (Drosophila) homolog 1 7.4

114463 AA025370 Hs.40109 KIAA0872 protein 83

114652 AA101416 Hs.107149 ESTs; Weakly similar to PTB-ASSOCIATED S 5.4

114721 AA131450 Hs.103822 ESTs 43

114730 AA133527 Hs.331328 ESTs; Weakly similar to The KIAA0138 gen 5.1

114633 AA234362 Hs37159 ESTs; Moderately similar to CGI-66 prote 53

114860 AA235112 Hs.42179 ESTs; Moderately similar to similar to m 63 

114884 AA235811 Hs393672 ESTs 53 

114895 AA236177 Hs.76591 KIAA08S7 protein 4.7

114908 AA236545 HS34973 ESTs 53

114932 AA242751 Hs.16218 KIAA0903 protein 5.7

115084 AA255566 H&42484 Homo sapiens mRNA; cDNA DKFZp564C053 (fr 53

115140 AA258030 Hs379938 ESTs; Weakly similar to supported by GEN 53

115468 AA287061 Hs.48499 ESTs; Highly similar to Bdeight protein 4.7

115583 AA398913 Hs.45231 LDOC1 protein 73 

115709 AA412519 Hs38279 ESTs 43 

115772 AA423972 Hs.131740 ESTs 5

115774 AA424029 Hs388390 ESTs; Moderately similar to dynamin; int 5A

115776 AA424038 Hs31897 ESTs 5

115821 AA427528 Hs.130965 ESTs; Weakly similar to ZINC FINGER PROT 13.7

115955 AA446121 Hs.44198 Homo sapiens BAC clone RG054D04 from 7q3 103

116024 AA451748 Hs33883 Human DNA sequence from clone 718J7 on c 63

116108 AA457566 Hs38777 ESTs 6

116117 AA459117 Hs31575 SEC63; endoplasmic reticulum translocon 73

116146 AA460701 Hs.15423 ESTs 53 

116296 AA489033 Hs.62601 Homo sapiens mRNA; cDNA DKFZp586K1318 (f 5.7 

116379 AA521472 Hs.71252 ESTs 53 

116393 AA599463 Hs306051 protein phosphatase 2 (formerly 2A); reg 53 

116401 AA599963 Hs39698 ESTs 73

116416 AA609219 Hs39982 ESTs 93 

116587 D59325 Hs.121429 ESTs 53 

116601 D80055 Hs.45140 ESTs 43 

116684 F09156 Hs36095 ESTs 73 

116722 F13654 HSFIH32 Stratagene cat#937212 (1992) Hom 53

116766 H13260 Hs.95097 ESTs 53

117453 N29568 Hs.108319 thyroid hormone receptor-associated prot 63

117557 N33920 Hs.44532 diubiquitin 43 

117708 N45114 Hs.126280 ESTs 63 
118001 N52151 Hs47447 ESTs 11.4

116229 N62339 Hs.166254 heat shock 90kD protein 1; alpha 62

118599 N69207 H&203697 ESTs 5.8

118645 N70358 Hs.125180 growth hormone receptor 7.1

118873 N89861 Hs.44577 ESTs 6

118985 N94303 H&55026 ESTs 9.3 

119107 R42424 Hs.63841 ESTs 6 

119126 R45175 Hs.117183 ESTs 17.9 

119271 T16387 H&65328 ESTs 6 

119367 T78324 Hs250895 ESTs 5

119721 W69440 Hs.48376 ESTs 154

119741 W70205 H&43670 kinesin family member 3A 10.1

119760 W72967 Hs.191381 ESTs; Weakly similar to hypothetical pro 5.3

120217 Z41078 Hs.66035 ESTs 4.8

120266 AA173939 H&205442 ESTs; Weakly similar to inner centromere 8.8

120294 AA190888 Hs.153831 ESTs; Highly similar to NY-REN-62 antige 4.9

120416 AA236010 H&26613 Homo sapiens mRNA; cDNA DKFZp586F1323 (f 47

120486 AA253400 Hs.137569 tumor protein 63 kDa with strong homolog 5.6

120524 AA261852 Hs.192905 ESTs 4.9

120571 AA280738 Hs34892 ESTs 8.8

120596 AA282074 HS237323 ESTs 62 

120713 AA292655 H&96557 ESTs 9.9 

120992 AA398246 H&97594 ESTs 164 

121429 AA406293 Hs.41167 ESTs 6.9 

121503 AA412049 H$290347 ESTs 7.6

121512 AA412105 Hs.193736 ESTs 5.8

121816 AA424814 H&48827 ESTs 4.6

122027 AA431302 Hs£8721 EST; Weakly similar to N-copine [H.sapie 5.6

122294 AA437311 H&98927 ESTs 5.7

122411 AA446B59 Hs*9Q83 ESTs 65

122791 AA460158 Hs.129836 KIAA1028 protein 124

122782 AA460225 K&99519 ESTs 5.1

122969 AA478539 Hs.104336 ESTs 4.9

123095 AA485724 Hs27413 ESTs 54

123100 AM85957 H&306219 Homo sapiens clone 25032 mRNA sequence 5

123295 AA495981 Hs250830 ESTs 4.7 

123311 AM96252 Hs.105069 ESTs 74 

123583 AA609006 Hs.111240 ESTs 9.1

123619 AA609200 ESTs 4.7

123645 AA609310 Hs.188691 ESTs 43

123709 AA609651 Hs.112742 ESTs 7

123968 C14333 Hs.108327 damage-specific DNA binding protein 1 (1 5

124178 H45996 HsJTIOI putative G protein-coupled receptor 6 J

124352 N21626 Hs.102406 ESTs 102

124357 N22401 yw37g07.s1 Morton Fetal Cochlea Homo sap 10.6

124515 N58172 Hs.109370 ESTs 142 

124911 R88992 Hs.174195 ESTs 4.8 

125154 W38419 ESTs 4.7 

125992 W01626 za3$e07/1 Soares fetal liver spleen 1NF 5.1

126802 AA947601 Hs.97056 ESTs 5.1

126812 Z36290 Hs.173933 ESTs; Weakly similar to NUCLEAR FACTOR 1 4.6

127080 AA662913 Hs.190173 ESTs 5

127308 AA507628 Hs.334390 ESTs 4.8

127370 AW24352 Hs.70337 immunoglobulin superfamily; member 4 4.7

127386 AI457411 Hs.106728 ESTs 4.8

127965 AA828760 HS292059 ESTs 4.8

128172 AI400862 Hs265130 ESTs 5

128305 AI039722 Hs279009 ESTs 5.8 

128420 AI088155 Hs.41298 ESTs; Weakly similar to unknown [H.sapie 17 

128467 AA176446 Hs.180428 ESTs; Weakly similar to hypothetical 43. 4.8

128610 L38608 Hs.10247 activated leucocyte cell adhesion molecu 7.9

128625 AA242816 Hs.102652 ESTs; Weakly similar to KIAA0437 [H.sapi 8.1

126651 AA446990 Hs.103135 ESTs 6.5

129088 AA215971 Hs.194431 KIAA0992 protein 52

129136 N26391 Hs250723 ESTs 5.1

129171 AA234048 Hs.7753 calumenin 5.8

129229 AA211941 Hs.109643 polyadenylate binding protein-interactin 5.8

129386 N27524 Hs260024 Cdc42 effector protein 3 52 

129467 AA410311 Hs.44208 ESTs 5.1 
129564 H22136 Hs.75295 guanylate cyclase 1; soluble; alpha 3 16.3

129699 AA458578 Hs.12017 KIAA0439 protein; homolog of yeast ubiqu 9.2

129821 F11019 Hs.12696 cortactin SH3 domain-binding protein 85

129823 X00946 Hs.105314 relaxin 2 (H2) 9.1

129847 W46767 H&296178 ESTs; Weakly similar to RNA POLYMERASE I 5.4

129912 AA047344 Hs.107213 ESTs; Highly similar to NY-REN-6 antigen 63

129958 L20591 Hs.1378 annexin A3 5.1

129977 J04C76 Hs.1395 early growth response 2 (Krox-20 (Drosop 8.6

130061 U82256 Hs.172851 arginase; type II 7.4

130241 U7B313 Hs.153203 MyoD family inhibitor 45

130466 N21679 Hs.180059 ESTs 53 

130541 X05608 H&21 1 584 neurofilament; light polypeptide (68kD) 6.7

130619 AA477739 Hs.12532 ESTs 6.4

130925 N71935 Hs.169378 multiple PDZ domain protein 7.9

130938 AA013250 H&21398 ESTs; Moderately similar to PUTATIVE GLU 6.2

130971 H20332 Hs501444 signal sequence receptor; gamma (translo 6.4

131066 1=09006 H&22588 ESTs 5

131126 FD9012 Hs.181326 myotubularin related protein 2 6.4

131310 J02950 H&2551 adrenergic; beta-2-; receptor; surface 7.9

131487 AA253220 H&Z7373 Homo sapiens mRNA; cDNA DKFZp56401763 (f 5.9

131561 X59641 Hs£94101 pre-B-cell leukemia transcription factor 7.6

131562 U90551 H&28777 H2A histone family; member L 5.1 
131579 N62922 Hs£9088 ESTs 11 
131629 AA442119 H&238809 ESTs 45 

131682 AA428368 H&30654 ESTs 45

131699 R68657 Hs.90421 ESTs; Moderately similar to !!!! ALU SUB 65

131795 N32724 Hs.32317 Sox-like transcriptional factor 5.6

132053 H93381 H&38085 ESTs; Weakly similar to putative glycine 72

132122 U65092 Hs.40403 Cbp/p300-interacting transactivator; wit 55

132191 AA449431 Hs588361 KIAA0741 gene product 8

132256 AA608856 Hs.431 murine leukemia viral (bmi-1) oncogene h 5.5

132482 AA429478 Hs238126 ESTs; Highly similar to CGI-49 protein [ 6.6 

132533 AA021608 Hs.172510 ESTs 55 

132572 AA448297 H&237825 signal recognition particle 72kD 62 

132581 R42266 Hs.52256 ESTs; Weakly similar to beta-TrCP protei 16

132700 N47109 Hs5521 ESTs 6.8

132701 AA279359 H&55220 BCL2-associated athanogene 2 5.3
132725 L41887 Hs.184167 splicing factor; arginine/serine-rich 7 75
132783 N74897 HS278894 DEAD/H (Asp-Glu-Ala-Asp/His) box polypep 55

132790 X75535 Hs.168670 peroxisomal farnesylated protein 8

132939 U76189 H&611S2 exostoses (multiple)-like 2 52

133142 F03321 Hs.65874 ESTs 52 

133342 U29589 Hs.7138 cholinergic receptor; muscarinic 3 10.3 

133434 AA278852 Hs-30212 ESTs 5.8 

133453 M68941 Hs.73825 protein tyrosine phosphatase; non-recept 45

133520 X74331 Hs.74519 primase; polypeptide 2A (58kD) 13.1

133544 T33873 Hs.74624 protein tyrosine phosphatase; receptor t 4.6

133608 D13315 Hs.75207 glyoxalase I 45

133626 H75939 Hs.75277 Homo sapiens mRNA; cDNA DKFZp586M141 (fr 5

133633 021262 Hs.75337 nucleolar phosphoprotein p130 65

133797 S66431 Hs.76272 retinoblastoma-binding protein 2 6

133928 N34096 Hs.7766 ubiquitin-conjugating enzyme E2E 1 (homo 5.4

134095 U47414 Hs.79069 cyclin G2 52

134249 N89827 Hs50667 RALBP1 associated Eps domain containing 65 

134321 AA418230 Hs5172 ESTs 7

134453 X70683 Hs.83484 SRY (sex determining region Y)-box 4 4.7

134542 X57025 Hs.85112 insulin-like growth factor 1 (somatomedi 7.7

134570 U66615 Hs.172280 SWI/SNF related; matrix associated; acti 6.4

134592 U82613 Hs2891G4 Alu-binding protein with zinc finger dom 5.4

134654 W23625 Hs.8739 ESTs; Weakly similar to ORF YGR200C [S.c 5

134666 AA482319 Hs.8752 putative type II membrane protein 5.4 

134606 Z49099 Hs59718 spermine synthase 6.7 

134951 AA431480 Hs.169358 ESTs 95

135066 X04602 Hs53913 interleukin 6 (interferon; beta 2) 5.7

135155 AA356268 Hs.166556 ESTs; Moderately similar to transcriptio 45

135411 L10333 Hs.99947 reticulon 1 55

300023 M10098 AFFX control: 18S ribosomal RNA 4.6

300254 AW079607 H&55610 ESTs; Weakly similar to ZnT-3 [H.sapiens 75

300273 AW013907 Hs.167531 ESTs; Moderately similar to predicted us 115 
300319 AW157646 Hs.153506 ESTs; Weakly similar to microtubule-acti 85

300566 H86709 Hs^26392 son of sevenless (Drosophila) homolog 1 5 5

300578 AI989417 Hs.134289 ESTs 4.4

300671 AI239706 Hs.93810 ESTs 7.9

300675 AA039352 Hs.125034 ESTs; Weakly similar to ORF YDUMOc [S.c 45

300680 AW468066 H&24817 ESTs; Weakly similar to KIAA0986 protein 52

300762 AI497778 Hs2Q5Q9 ESTs 6.4

300810 AI076890 Hs.146847 ESTs 55

300813 AA406411 H&208341 ESTs; Weakly similar to KIAA0989 protein 105

300823 AI863068 Hs.106823 ESTs; Weakly similar to putative zinc fi 55

300834 AF109300 Hs.147924 ESTs 6.7 

300923 AW136372 Hs.1852 ESTs 75 

300962 AA593373 Hs.293744 ESTs 55 

301015 AA947682 HS.2Q252 ESTs; Weakly similar to Chain A; Cdc42hs 7

301042 AI659131 Hs.197733 ESTs 245

301242 AW161535 Hs23782 ESTs 115

301254 AI049624 H&283390 EST cluster (not in UniGene) with exon h 45

301262 H29500 Hs.7130 ESTs; Moderately similar to N-copine [H. 45

301388 AA156879 H&262036 ESTs; Weakly similar to ZINC FINGER PROT 65

301563 AI802946 Hs.44208 ESTs; Weakly similar to match to ESTs AA 5.7

301656 AW008475 Hs.151258 EST cluster (not in UniGene) with exon h 65

301689 244810 Hs.301789 ESTs; Weakly similar to similar to Oete 65

301783 AL046347 H$53937 Homo sapiens PAC clone DJ1 159004 from 7p 6.2

301805 AI800004 Hs.142846 ESTs; Weakly similar to MesP1 [M.musculu 85

301846 R200Q2 Hs5823 ESTs; Weakly similar to Intrinsic factor 45

301891 AF131855 H&278591 Homo sapiens clone 25058 mRNA sequence 65

302005 AI869666 Hs.123119 ESTs 365

302056 AI457532 Hs.30488 ESTs; Moderately similar to ROSA26AS [M. 95

302067 H05698 H &22239 9 ESTs; Weakly similar to protein-tyrosine 55

302099 AL021397 Hs.137576 ribosomal protein L34 pseudogene 1 85

302147 AB022660 Hs.151717 KIAA0437 protein 5.9

302214 AJ001454 Hs.159425 Homo sapiens mRNA for testican-3 45

302236 AI128606 Hs5557 zinc finger protein 161 4.3 

302358 D81150 Hs522848 EST cluster (not in UniGene) with exon h 55

302410 NM_004817 Hs218366 EST cluster (not in UniGene) with exon h 265

302486 AC003682 Hs.183512 multiple UniGene matches B2

302582 NMJXXK22 Hs249195 EST cluster (not in UniGene) with exon h 6.4

302785 AA425562 Hs.11065 EST cluster (not in UniGene) with exon h 5

302792 AA343696 Hs.46821 ESTs; Weakly similar to putative [H.sapi 45

302881 AA508353 Hs.105314 relaxin 1 (H1) 785

302892 N58545 Hs.42346 histone deacetylase 3 85

302970 AW118352 Hs512679 EST cluster (not in UniGene) with exon h 7.4

302977 AW263124 Hs515111 EST cluster (not in UniGene) with exon h 55

303029 AF199613 EST cluster (not in UniGene) with exon h 45

303125 AF161352 Hs.111782 EST cluster (not in UniGene) with exon h 55

303280 AI571580 Hs.170307 ESTs 45

303306 AA215297 Hs.61441 EST cluster (not in UniGene) with exon h 6.4

303309 AL134164 Hs.145416 ESTs 65

303344 AA255977 Hs250646 ESTs; Highly similar to ubiquitin-conjug 195

303380 AA298471 Hs528567 EST cluster (not in UniGene) with exon h 65

305401 AA758552 Hs509497 ESTs 65 

303525 AW516519 Hs273294 ESTs 45 

303526 AA348111 Hs.86900 ESTs 12.1
303540 AA355607 Hs.309490 ESTs; Weakly similar to MMSET type I [H. 82

303572 AW338520 Hs242540 ESTs 8.4

303685 AW500106 H&23643 EST cluster (not in UniGene) with exon h 4.9

303699 D30891 Hs.19525 EST cluster (not in UniGene) with exon h 15.7

303702 AW500748 Hs224961 ESTs; Weakly similar to 73 kDa subunit o 65

303718 AI741397 Hs.114658 ESTs 45

303722 AA521510 Hs.145010 ESTs 125

303732 AW502405 Hs.125759 ESTs; Weakly similar to tumor suppressor 4.3

303735 AA707750 Hs.169055 ESTs; Weakly similar to cis-Golgi matrix 5.4

303752 AI017286 Hs5957 EST cluster (not in UniGene) with exon h 55

303753 AW503733 Hs.9414 ESTs 13
303813 AI275850 Hs.114658 EST cluster (not in UniGene) with exon h 75

304053 R00493 Hs.125565 translocase of inner mitochondrial membr 45

304218 N66373 Hs27973 ESTs; Weakly similar to ZK354.7 [C.elega 6

305200 AA668128 Hs.45207 EST singleton (not in UniGene) with exon 5.7

306716 AIQ24916 Hs251354 ESTs 5.7




307848 AJ364186 EST singleton (not in UniGene) with exon 7.3

307871 A) 368865 Hs51 476 EST singleton (not in UniGene) with exon 54

308050 AI460004 Hs.31608 EST singleton (not in UniGene) with exon 8.1

308362 AI613519 Hs.105749 EST singleton (not in UniGene) with exon 5f

308923 A! 663051 Hs579815 ESTs AA

309116 AI927149 Hs 29797 ribosomal protein L10 AS

309375 AW075342 Hs.9271 EST singleton (not in UniGene) with exon 7A

309674 AW205604 Hs.266009 ESTs; Weakly similar to !!!! ALU SUBFAMI 5

310095 AI921750 Hs.144871 ESTs 5

310098 AJ685841 Hs.161354 ESTs 11.6

310250 AI478629 Hs.158465 ESTs 5.8

3103S5 Ai26214fi Hs.145569 ESTs 9.7

310382 AI734009 Hs.127699 EST cluster (not in UniGene) 104

310409 AI612775 Hs.145710 ESTs 4.6

310431 AI420227 Hs.149358 ESTs 725

310573 AW292180 Hs.156142 ESTs 7.6

31Q59S AI338013 Hs.140546 ESTs 92

310639 AW269082 Hs.175162 ESTs AS

310787 AW262580 Hs.147674 ESTs AS

310816 AI973051 H&224985 ESTs 7j6

311251 Alfi556fi2 Hs.197698 ESTs 41^


311980 AI7B7957 


Hs.1 98248 ESTs: Weaktv simitar to Y38A8.1 oene pro 


A5 


0 1 lOOv nJD/ 3Jt*t 


H<l201 629 ESTs* Moderatah/ similar to 111 1 ALU SUB 


4.6 


111*51*? AW138713 

0 11013 rvfllOO/IO 


Ha238fi2 ESTs 

1 IMMOUb LM 1 a 


5^ 


<111K74 AIR94Rfi3 


H&211420 ESTs 

1 loir 1 ItbV ww I o 


43 


QH1CQ7 AIQOQOCA 
01 190/ WOwlaw 


H&27101& ESTs 


&& 


41 1 cor AjcQonoa 

O 11090 rUOOCUOO 


Hs.7fl37S ESTs 


264 


311 R31 AlftflQ51Q 

01 1001 n!0U9OI9 


H&27133 ESTs 


64 


311fiftft AW025681 

OtlOOO nilvulDu l 


* H&24009D ESTs 


74 


3117R3 AI682478 

OII/OO HlOOtHfO 


Hs.1 3528 EST 


4.6 


311396 AA765470 

OIIO£u tv\l Wt/W 


Hs.85092 ESTs 


6.7 


'MIRW AWM4M3 


He 107056 PSTs 


53 


311001 R 16890 
oj iwi nioow 


Hs.1 37135 ESTs 


5.6 


oil 30c «W*K>IOO*t 


Hs 957482 ESTs 


4.3 


OlclOO Mwastau 


Hs.1 18695 cytochrome b-561 


11 


319139 AA3343nfl 


Hs39_B263 EST duster (not in UniGene) 


165 


319949 AI3flf)9fJ7 


He 195276 ESTs 


4.7 


41090ft Pf113fi7 
OI&ZoO wvlOO/ 


He 19719ft FSTq. 
no. it/ ito uOlo 


5,3 


419Afl7 DZft13H 


Hs 1*13485 ESTs 


B2 


319494 AA34739S 


Hs291997 ESTs 


4.6 


419A95 RACMRQ 
OltnitO rVKMOO 


Hs 9R3fi99 ESTs 


52 


41940ft DCQRC1 


He 144397 ESTs 
no. iH*»3*j/ eo i a 


95 


319518 C17785 

OI£3IO VI//O0 


Hs.182738 ESTs 


65 


410W1 AA033f;OQ 


He 933R34 ESTs 


112 


419097 AIROAR99 


HR.1Q1271 ESTs 

no»l9l£/ 1 bwlO 


4.7 


OI&O09 WWW// 


He 900360 ESTs 


7 


419C4fi AIR94C11 
OltOMO /UQCOOII 


He 11RRR7 PQTe 
no. i iooo/ co i o 


5.1 


OI&ODO nA9/OUOn 


Hq 1R0842 ESTs 

nOtlOwOHfe E»QIO 


65 


419A99 AARQA£fl7 
Ol£OiS0 /v\D9*+0U/ 


He 17fiQ56 EOT riiLetflr /nnt In UnlQfine) 


105 


Q19QK7 AA77997G 


He193Q14 F3Te 

110.160914 CO ID 


5 


fJIOOQn AtQIQfiE/1 

31Z8SKJ Aloioo54 


nsjyo/ to is 


55 


4"f OOH4 A AQ^QOA^ 

ol&bvo /vwoatoo 


nox/ootD co i o 


7.7 


312905 H92571 HS234478 ESTs 65

312976 AA836271 Hs.125830 ESTs 45

312983 AJ07S278 Hs^69899 ESTs 5.1

312996 AA249018 Hs.154331 EST cluster (not in UniGene) 7

313035 N36417 Hs.144928 ESTs 65

313166 AJ801098 Hs.151500 ESTs 45

313188 AI039702 Hs.179573 collagen; type 1; alpha 2 45

313218 AA827805 Hs.124296 ESTs 5

313226 AI200281 Hs.123910 ESTs 55

313325 AI420611 Hs.127832 ESTs 45

313326 AI088120 Hs.122329 ESTs 74

313425 AA745689 Hs.186838 ESTs; Weakly similar to similar to zinc 6.3

313499 AJ2613S0 Hs.146085 ESTs 55

313540 AI797301 HS5740 ESTs 5.9

313568 AW467376 Hs.129640 ESTs 45

313569 AJ273419 Hs.135146 ESTs; Weakly similar to ZK1058^[C.eleg 4.6

313603 AW468119 HS587631 EST cluster (not in UniGene) 6.8
AI391470


Hs.158618 ESTs 


5^ 
Hs.189413 ESTs


5 
AA679430


Hs.191697 ESTs 


5.7 


315990 


AI800041 


Hs.190555 ESTs 


92 


316012 


AA764950 


Hs.119838 ESTs 


4.3 


316036 


AA708016 


Hs.130389 ESTs 


5.9 


316055 


AA693680 


Hs.6947 EST cluster (not in UniGene)


6.7 


316074 


AW517542 


HS293273 ESTs 


55 


316100 


AW203986 


Hs^13003 ESTs 


5.1 


316169 


AI127483


Hs.120451 ESTs 


82 


316442 


AA760894 


Hs.153023 ESTs 


17.1 


316491 


AA766025 


Hs.186854 EST 


45 
AW135854


Hs.132458 ESTs 
316854 AA831215 Hs.159066 ESTs; Weakly similar to predicted using 5.1

316905 AW138241 Hs51Q846 ESTs 6.4

317006 AW051597 Hs.143707 ESTs 4.4

317019 AAS64968 Hs.127699 ESTs 11

317194 AW445167 Hs.126036 ESTs 135

317224 056760 Hs53029 ESTs 8.7 

317404 AI806867 Hs.126594 ESTs 8.7 

317501 AA931245 Hs.137097 ESTs 11.1 

317548 AI654187 Hs.195704 ESTs 145 

317651 AW292779 Hs.169799 ESTs 55

317758 AI733277 Hs.128321 ESTs 5.4

317850 N29974 Hs.152982 EST cluster (not in UniGene) 11.4

317869 AW295184 Hs.129142 ESTs; Weakly similar to DEOXYRIBONUCLEAS 135

317902 AI828602 Hs511265 ESTs 5.3

317916 AI565071 Hs.159983 ESTs 7.7

318239 AI085198 Hs.164228 ESTs 13.1 

318268 AI817736 Hs.182490 ESTs 6.2 

318327 AW294013 Hs50Q942 ESTs 4.6 

318363 R45530 H&1440 gamma-aminobutyric acid (GABA) A recepto 6

318428 AI949409 Hs.194591 ESTs 125

318464 AI151010 Hs.157774 ESTs 4.3

318524 AW291511 Hs.159066 ESTs 255

318540 T3Q280 H&274803 EST cluster (not in UniGene) 7

318591 AW206806 Hs.115325 ESTs 4.8

318615 AI133617 Hs.10177 ESTs 55

318646 AW175665 Hs578695 ESTs 5.7

318667 AI493742 Hs.165210 ESTs 11

318668 W26276 Hs.136075 ESTs 5.9
318753 AA578265 Hs.7130 copine IV 55

319080 Z45131 Hs53023 ESTs 165

319181 F06504 H&27384 EST cluster (not in UniGene) 4.6 

319191 AF071538 Hs.79414 prostate epithelium-specific Ets transcr 6.6 

319233 R21054 Hs.180532 ESTs 45 

319586 D78806 H&283683 ESTs 8.2 

319750 AA621606 Hs.117956 ESTs 95

319763 AA460775 Hs5295 ESTs 145

319824 AA424266 Hs.123642 EST cluster (not in UniGene) 125

319838 AA337642 Hs55262 nuclear factor related to kappa B bindin 5.1

319913 AA179304 Hs571586 ESTs; Moderately similar to !!!! ALU SUB 45

319954 7180579 H&29Q27D ESTs 5.8

320076 AI653733 H&271593 ESTs 85

320102 AW296219 Hs.115325 RAB7; member RAS oncogene family* 1 95

320187 T99949 Hs503428 EST cluster (not in UniGene) 95

320211 AL0394Q2 Hs.125783 DEME-6 protein 75

320324 AF0712Q2 Hs.139336 ATP-binding cassette; sub-family C (CFTR 562

320455 R49889 H&24144 EST cluster (not in UniGene) 85

320464 AI089817 H&237146 ESTs 5.4

320561 NMJW6953 Hs.159330 EST cluster (not in UniGene) 7

320574 AL049443 Hs.161283 Homo sapiens mRNA; cDNA DKFZp5B6N2Q20 (f 44

320576 AL049977 Hs.162209 Homo sapiens mRNA; cDNA DKFZp564C122 (fr 6.7

320654 AW263086 Hs.118112 ESTs 6

320796 AF038966 Hs51218 secretory carrier membrane protein 1 135 

320800 AI681006 Hs.71721 ESTs 65

320813 AW360847 Hs.16578 ESTs 95

320853 AI473796 Hs.135904 ESTs 8.1

320856 D59945 Hs55366 EST cluster (not in UniGene) 6

320899 AA633772 Hs.116796 ESTs 95

320918 AW195012 Hs593970 ESTs 5

320973 H19732 Hs547917 ESTs 55

321099 AA018386 Hs.64341 ESTs 4.6

321190 H52462 Hs.163872 EST cluster (not in UniGene) 5.8

321318 AB033041 Hs.137507 EST cluster (not in UniGene) 8.4

321382 AW372449 Hs.175982 EST cluster (not in UniGene) 75

321441 AW297633 Hs.11B498 ESTs 14.7

321538 H80483 Hs.46903 EST cluster (not in UniGene) 95

321609 H86Q21 Hs.182538 ESTs; Weakly similar to hMmTRAlb [H.sapi 45

321636 AI791838 Hs.193465 ESTs 55

321638 AI356352 Hs.108932 ESTs 45

321644 AI204177 H&537396 ESTs 65



321681 


AA233821 


Hs.190173 

I H>* 1 VW If W 


EST duster 1 no! in UniGene 1 


4.6 


321726 


X91221 


Hs.144465 


EST duster (not in UnfGene) 


5 


321758 


U29112 


H&196151 


EST duster (not in UniGene) 


62 


321877 


AL1097B4 


Hs.189222 


EST duster fnnt in UniGene) 


4.6 


321899 


N55158 


H&29468 


ESTs 


4.6 


321 802 


AA746374 


Hs.145010 


ESTs 


8.2 


322007 


AW410646 


Hs. 164649 


ESTs 


5.1 


322055 


AL137B46 


Hs.146001 


EST duster (not in UniGene) 


4.3 


322092 


AF085833 


Hs.135624 


EST duster (not in UniGene) 


43 


322221 


AI890619 


Hs.178662 


find Ansfi iha AccAniTiiV n rot Am i*IikA 1 


4.4 


322278 


AF086283 




EST duster (not in UniGene) 


5.8 


322303 


W07459 


Hs. 157601 


EST duster fnof in UnfGenei 


22 


322437 


AW393804 


Hs. 170253 


EST s* Waaktv similar to rabaotln-4 n-Lsa 


4.4 


322493 


AF143235 


Hs379819 


EST cluster (not in UniGene) 


72 


322762 


AA056060 


H&2Q2577 


EST duster (not in UniGene) 


164 


322811 


AA782292 


Hs.105872 


ESTs 


65 


322818 


AW043782 


H&293616 


ESTs 


10.7 


322826 


A1807883 


Hs.180059 


ESTs 


5 


322887 


AI986306 


Hs.86149 


ESTs; Weakly similar to KIAAQ969 protein 


11J9 


322889 


AA081924 


Hs.124918 


ESTs 


7.1 


322924 


AA6S92S3 


Hs.136075 


ESTs 


45 


322982 


A1351191 


Hs.128430 


ESTs 


6.6 


322994 


AA422116 


Hfi.191461 

1 19. 10 1 "fU 1 


ESTs 


47 


323040 


AA336BQ9 


Hs.10862 


ESTs 


65 


393041 


AL1 18747 


H&26691 


EST duster (ntit in UniGene. 


83 






Ue 188838 


ESTs 


43 


323048 


AL1 18923 


Hs.175110 


EST duster (not in UniGene) 


75 




AA1 57728 

Wig/ J£Q 


Hs364330 


ESTs 


7& 


323071 


AA1 57867 


H&5722 


ESTs 


47 


323097 


244354 


He9fl6281 


amnlna nudfloKda hfnrfinn nrafnin /G nr 


4.9 


323131 


AA176982 


HS270124 


EST duster (not in UniGene) 


6.1 


323136 


AL120351 


H$30177 


EST duster /not in UniGene) 

tvl WU9U3I \IIUIIM uiuvroiitsy 


43 


323175 


AI827137 


Hs336454 


ESTs 


6.2 


3£0£ 10 


AF131848 


Hs. 13398 
nth i<M«o 


Hnmn eanfone pinna 94)98 mRM4 epnimnm 


63 


393996 


AF055019 




Hnnm efiniene rinno 94870 mRM A epniwnro 


•J2J3 


323236 


AA3S3148 




ESTs 


105 


323262 


AI89Q77n 


Hs. 190642 


ESTs 


76 


323276 


AAB3B459 


Hs323822 


pCTe 
Wis 


76 

f JO 


393987 


AA639902 


Hs.104215 


ESTs 


247 


323335 


AI 655499 


Hs.161712 


ESTs 


14.1 


393341 


AL1 34875 


He 10884ft 


ESTs 


53 


323362 


41 135087 


He 117189 


FffTe 
BOla 


8 1 

D.I 




C05978 


He9flQ991 


ESTs! Modflmtfllv ejmnarto iPYRUVATE DP 


8.5 


O&OnOU 


AtftPfiftfrt 
raocuowi 


H&300700 


CCTc 
W 1 o 


4^ 


323507 


H71721 


Hs.128387 

1 lO. IMMVI 


ESTs 


4^4 




AI814405 


He994569 


ESTs 


CO 

OA) 


323623 


AA314980 


Hs.146589 


EST duster frtnf in 1 JnlGAnn) 


5 


Q9QAAQ 


AW983£9ft 


He 943094 


CCTe 
bois 


77 


323691 


AA317581 


He 14I18QQ 


EST duster fnnt in LiniGfinn) 

Col UUMOI ^llUt UI UiUUttnlsJ 


3-« 


G&OOIU 


AA74040K 


He 108808 
no. iuoduo 


ESTs 


6.2 




/vVjO/wli 


He 137831? 

[JO. 10/033 


PCTe 
CO IO 


A 
U 




/vVXJHoW 




CCTc 
CO 1 a 


107 


323959 


AI635775 


HS.6831 


ESTs 


5.4 


323996 


AA367032 


Hs317882 


ESTs 


53 


323997 


AA844907 


H&274454 


EST duster (not in UniGene) 


4.4 


324019 


AW177009 




EST duster (not in UniGene) 


43 


324130 


AL046575 


Hs.130198 


ESTs 


11 


324295 


A1146686 


Hs.143891 


ESTs 


137 


324296 


AJ524039 


Hs.192524 


ESTs 


63 


324307 


AA627642 


Hs.4994 


transducer of ERBB2; 2 (T0B2) 


43 


324330 


AA884766 




EST duster (not in UniGene) 


43 


324385 


F28212 


H&284247 


EST duster (not in UniGene) 


47 


324430 


AA464018 


Hs.164598 


EST cluster (not in UniGene) 


133 


324452 


AW014022 


Hs.170953 


ESTs 


7.6 


324547 


AW501974 


Hs.74170 


ESTs 


5.6 


324603 


AW016378 


HSJ292934 


ESTs 


242 


324617 


AA508552 


Hs.185839 


ESTs • 


54 


324618 


AJ346282 


HS37159 


ESTs 


43 


324620 


AA448Q21 


Hs.94109 


EST cluster (not in UniGene) 


57 
324696
324713 
324715 
324718 
324720 
324752 
324753 
324790 



324804 
324645 
324888 



325108 
326816



327098 
328492 
329362 
329929
329960 
330020 
330211 



AI685464 ESTs
AI694767 Hs.129179 ESTs
AW503943 Hs.112451 ESTs

AI217983 Hs293341 ESTs; Weakly similar to Pro-a2(XI) [H.sa
AA641092 Hs257339 ESTs
AW340249 Hs.163440 ESTs
AI739168 Hs.131798 EST cluster (not in UniGene)
AI557019 Hs.116467 ESTs
AA576904 Hs292437 ESTs

AI279919 Hs.272072 ESTs; Moderately similar to !!!! ALU SUB
AA612626 Hs.144871 EST cluster (not in UniGene)
AJ334367 Hs.159337 ESTs
AI819924 Hs.14553 ESTs
AI692552 ESTs
AA361016 Hs537533 ESTs
AI564134 Hs.136102 KIAA0853 protein
AI741633 Hs.125350 ESTs
AA613792 EST cluster (not in UniGene)

AA401883 HS22380 ESTs

CH20_hsgt|6552458 
CH21_hsgi|5867660 
CH21Jisgl|6682516 
CH.07_hsgi[5868455 
CHJLhsgil5868837 
CH.16JJ2 #165201 
CH.l6j2gfe091894 
CH.16_p2gi|6671887 
CH.05_p2gi|6013592 
androgen receptor (dihydrotestosterone r 
H&321110 

guanine nucleotide binding protein 4 
hepatocyte nuclear factor 3; alpha 
ESTs 
ESTs 
ESTs 
ESTs 

ESTs; Moderately similar to kynurenine a
ESTs

Hs24052 ESTs; Weakly similar to !!!! ALU SUBFAMI
Hs55254 ESTs

Hs.15251 Human DNA sequence from clone 437M21 on
Hs.143187 FK506-binding protein 3 (25kD)
Hs.11356 ESTs
EST

Hs.91202 ESTs
Hs.142896 ESTs
Hs5151B1 ESTs
Hs.108920 ESTs
Hs.14846 ESTs
Hs26B714 ESTs
Hs268838 ESTs
T64447 Hs.168439 ESTs
AA262999 Hs.300141 ESTs
AA276355 Hs.87929 ESTs
AA287662 Hs.118630 ESTs
AA400596 Hs58143 ESTs
AA416979 Hs51897 ESTs
AA454543 Hs.43543 ESTs

F10802 Hs237339 ESTs; Moderately similar to !!!! ALU SUB
H77381 Hs41223 ESTs
N21680 Hs.43455 ESTs
N27154 Hs.44076 ESTs

N32912 Hs291039 ESTs; Weakly similar to hypothetical 43.
N34357 Hs.93817 ESTs
N62780 Hs.48703 ESTs
N92352 Hs5472 ESTs
W48868 Hs.334305 ESTs
Z38907 Hs.65949 KIAA0888 protein 
331811 AA404500 Hs.187958 ESTs 
330430
330546 
330551 
330658 
330700 
330704 
330705 
330706 
330712 
330725 



AA319514 
AAD37415 
AA056557 
AA102571 
AA121140 
AA167269 
AA252033 
330732 AA281092
AA449677 
AA450200 
AA479114 

D60374 

330892 AA149579
H01458 
H20826 
N24619 
R36671 
R51361 



HG2261-HT2352 
U31382 Hs.299867



Hs50732 

Hs20999 

Hs.6759 

Hs.157078 

Hs.177576 



330763 
330772 
330786 



330949 
330977 
331017 
331099 
331128 
331151 
331195 
331320 
331321 
331337 
331348 
331359 
331383 
331422 
331442 
331466 
331479 
331490 
331493 
331561 
331615 
331659 



9 

22 

4.9 

10.6 

102 

5.5 

72 

34.4 

4.8 

7.9 

52 

7.6 

12.6 

65 

4.5 

4.4 

6.5 

5.1 

7.1 

95 

A3 

A3 

55 

45 

55 

7.6 

6 

12.6 
9 

Antigen, Prostate Specific, Alt. Splice
6 

4.9 
6 

55 

5.1 

11.7 

145 

5 

72 

4.9 

185 

A3 

S3 

A3 

15.3 

10.3 

4.4 

115 

115 

4.8 

13 

4.9 

4.8

6.1 

92 

9.9 

4.3 

4.6 

45 

75 

54 

65 

125 

45 

92 

4.6 

8.7 

105 

45 
331848 AA417039 Hs.98268 signal recognition particle 72kD 75
331873 AA429445 Hs.98640 ESTs 65
331889 AA431407 Hs.98802 Homo sapiens Chromosome 16 BAC clone CIT 335
331967 AA460158 Hs59589 KIAA1028 protein 65
331974 AA464518 Hs.105322 ESTs 55
332043 AA490831 Hs201591 ESTs 105
332076 AA599477 Hs291156 ESTs 4.4
332173 F09281 Hs.100725 ESTs 55
332247 N58172 ESTs 142
332249 N62096 Hs.194140 ESTs 72
332325 T79428 Hs.339667 ESTs 55
332396 AA340504 ESTs; Weakly similar to similar to human 212
332434 N75542 Hs237731 transcription factor 4 155
332493 N95495 Hs56729 ESTs; Highly similar to GTP-binding prot 7.1
332522 158503 Hs.178357 glutathione S-transferase theta 2 65
332526 AA281753 Hs.17731 inositol 1,4,5-triphosphate receptor; ty 55
332530 M316B2 Hs.19280 inhibin; beta B (activin AB beta polypep 55
332533 M99487 Hs.325625 folate hydrolase (prostate-specific memb 38.1
332538 N48715 Hs20991 ESTs 65
332546 D84454 Hs22587 solute carrier family 35 (UDP-galactose 45
332594 AA279313 Hs.32951 methyl CpG binding protein 2 55
332610 AA412405 Hs.40513 ESTs; Weakly similar to BETA GALACTOSIDA 5.6
332661 N9S742 Hs5390 ESTs 65
332697 T94885 Hs.75725 carboxypeptidase E 245
332712 D26070 Hs.79306 inositol 1,4,5-triphosphate receptor; ty 95
332716 L00058 Hs.79630 v-myc avian myelocytomatosis viral oncog 55
332726 R72029 Hs53428 synaptophysin-like protein 5
332781 AA233258 ESTs; Weakly similar to 01 0075 [Cetega 45
332797 CH22_FGENES.6J 305
332798 CH22_FGENES5_5 685
332799 CH22J=GENES.6_6 195
332933 CH22_FQENES58_7 55
332980 CH22_FGENES54J 55
332984 CH22LFGENES54J 45
333168 CH22_FGENES54_1 4.7
333169 CH22_FGENES54_2 AA
333452 CH22_FGENES.157J 45
333456 CH22_FGENES.157_5 45
333458 CH22_FGENES.157_7 45
333611 CH2^R3BES217_6 4J
333621 CH22J=GENES219_5 55
333814 CH22_FGENES282_2 7.1
333849 CH22_FGENES290_8 62
333949 CH22_FGENES503_5 45
333951 CH2^FGBJES503_7 45
333955 CH22_FGENES503_11 5.6
334150 CH22.FGENES539J 5.1
334223 CH22_FGENES560_4 205
334297 CH22_FGBIES572_3 9.4
334443 CH22_FGENES587_2 4.6
334444 CH22J=GENES587_4 5.6
334447 CH22JGENES587J 13.1
334570 CH22_FGENES.405 J 1 5.4
334749 CH22_FGENES.427_1 55
334777 CH22_fGENES.430„9 4.7
334960 CH22_FGENES.465_29 52
335179 CH22_FGENES504_9 1 85
335293 CH22_JGENES527J 4.7
335550 CH22_FGENES576_11 5.1
335581 CH22_FGENES581_19 5.7
335586 CH22_FGENES581_25 45
335809 CH22LFGENES517J 62
335810 CH22.FGENES517.7 55
335822 CH22_FGENES519_7 7.1
335824 CH22.FGENES.619J1 85
335853 CK22_FGENES.626_5 45
335886 CH22_FGENES.632_4 45
336034 CH22.FGENES578.5 6.8
336441 CH22.FGENES527.7 75
336624 CH22_FGENES.6-3 433
336625 CH22_FGENES.64 375
336679 CH22_FGENES.43-7 5.3
337577 CH22 C65E1.GENSCAN3-1 4$
338255 CH22.B/IAC005500.GENSCAN276-3 134
338260 CH22_By1^C005500.GENSCAN279-10 4.6
338561 CH22 EMAC005500.GENSCAM21-5 4.6
338562 CH22_EM^C005500.GENSCAN.421-6 42
338759 CH22_EM^C005500.GENSCAN^17-6 5.1
338763 CH22^EM^C005500.GENSCAN51M6 5.5
338764 CH22_EM^C005500.GENSCAN51M7 7.1
TABLE 3A shows the accession numbers for those primekeys lacking UniGene IDs for Table
3. For each probeset we have listed the gene cluster number from which the oligonucleotides
were designed. Gene clusters were compiled using sequences derived from Genbank ESTs
and mRNAs. These sequences were clustered based on sequence similarity using Clustering
and Alignment Tools (DoubleTwist, Oakland California). The Genbank accession numbers
for sequences comprising each cluster are listed in the "Accession" column.

Pkey: Unique Eos probeset identifier number

CAT number: Gene cluster number

Accession: Genbank accession numbers

Pkey CAT number Accession

123619 371681J AA602864 AA609200
116722 143512 J Z24878 AA494098 F13654 AA494040 AA143127
103677 41847 1 Z83806 AJ132091 AJ132090
125992 1589048.1 H48372 W01626
109342 genbank_AA213620 AA213620
125154 genbank_W38419 W38419
101447 entrez_M21305 M21305
124357 genbank_N22401 N22401
108910 genbank_AA136590 AA136590
322278 47271J W69304 AF086283 W69200
315084 350959J AI821085 AW973464 AA554802 A1821 B31 AA657438 AA640756 AAS50339
324019 262792.1 AW177009 AI381610
324330 300543.1 AA884766 AW974271 AA592975 AA447312
324626 336411.1 AI685464 AW971336 AA513587 AA525142
303029 37699 1 AF199613 AF108756
324804 398093.1 AI692552 AJ393343 AI800510 AJ377711 F24263 AA661876
324961 376239.1 AA613792 AW182329 T05304 AW858385

329362 cj_hs 
35 336624 ChB2.4071FGJL3_ 

336625 CH22_4072FG__6_4_ 

336679 CH22_4157FG_43_7_ 

338255 CH2?_6856FGL_LtNK_EMAC00 

338260 CH22.6863FG_LINK^Brf:AC00 
40 329329 C16J32 

329960 c16jp2 

338561 f^22.7294FGL_.UNlLEMAC00 

338562 CH22_7295FG_UNK_EMAC00 
338759 CH22J581FG_UNK_EM:AC00 

45 338763 CH22 7585FG_UNK»EMAC00 
338764 CH2^7586FCUUNK_EMAC00 

333168 CH22.400FG.64.1.UNieEMA 

333169 CH22.401 FG_94_2_UNK U EM:A 
333452 CH22.702FGJ57J.LINieEM: 

50 333456 CH22 706FGJ57_5_UNieEM: 
r 333458 CH22_708FGJ57_7_UNieEM: 

333611 CH2^.872FGJ17J.UNK_EM: 

333621 CH22^882R=L219.5JJNK_EM: 

333814 CH22_1083FGJ282J?.UNK.EM 
55 ' 333849 CH22_1118FG - 290.8_LINK„EM 

335179 CH22.2515FG.504_9_UNK_EM 

333949 CH22_1225FG.303_5.UNK.EM 

333951 CH22J227FG__303_7JJNK_EM 

333955 CH22J231FG.303J 1JJNK.E 
60 '335293 CH22_2635FG_527_6_UNK_EM 

326816 c20.hs 

326997 c21_hs 

335550 CH22_2905FG.576 J 1JJNK.E 
335581 CH22j2938FG.581_19JJNK w E 
65 335586 CH2?J944I=G.581 J5.UNK.E 
328492 c_7_hs

335809 CH22_3181FQJ17_6J.INK.EM 

335810 CH22J182FQJ17J_UNK.EM 
335822 CH22J195FG 619 7 UNK_EM 

5 335824 CH22_3197FG_619J1_UNK.E 
335853 CH22_3228F<L626AJ.INieEM 
335888 CH22_3261FQ_632_4_UNieEM 
330020 c16_p2 
330211 c_5j>2 
10 337577 CH22_5864FG_JJNK_C65E1 .G 
307348 AI364186 

332797 CH22J3FQ_6JJ.INK_C4Q1.Q 

332798 CH22J4FG_e_5_UNK_C4G1.G 

332799 CH22J5FCL6_6JJNK_C4G1.G 
15 334150 CH22.1429FCL339JJJNK.EM 

332933 CH22J54FG_38_7J.INieC20H 
332980 CH22J04FG_54_1_UNK_EM:A 
332984 CH22_208FG_54_6_UNK_EM:A 
334223 CH22J507FGj60_4JJNfeEM 
20 334297 CH22J588FG_.372_3JJNK._EM 
327098 c21JlS 

334443 CH2^1742FG^3S7JLUNK_EM 

334444 CH22J743FG_387_4JJNK_EM 
334447 CH22J746FG_387_7JJNK_EM 

25 334570 CH22J875FG_405_11_UNK_E 
334749 CH22J061 FG_427_1_UNK_EM 
334777 CH22_2089FG_430_9 _UNK_EM 
338034 CH22 3419FG 678J5JJNKJ>J 
334960 CH22_2281FG_4S5_29_UNK_E 
30 336441 CH22_3861FG_827_7JJNKJW 

330551 9851.2 U39840 NM 004496 AW1 35607 BE087458 BE087567 AA1771 16 AW1 95705 AW750756 AI81 1008 AI694151 

BE348594 AW971075 A1347950 A1201455 AI073898 AA652680 AA613671 A131B354 AA507550 AA693692 
AI032599AA991871 AI269801 AW948974T74639 AA532907 AW949173 
330786 53973.3 BE379594 Al 192455 AL039862 AI744012 AI761735 AW243181 AI743687 Al 928223 A1423022 AI627855 

35 AI636059A1651571 AW802044 AI826995 AI431733 AI539125 AAB63056 AW27091 0 AI768930 AW008835 

AW615183 AW591147 A1695294 AI672106 AA506358 AI308060 AA01 1556 AA962437AI935488 BE219625 
AI004356 AW151394 AI218466 N66178 AW 19784 AW242519 AW946907 D60374 AA989263 AI69S799 
AA470460 A1824167 

332247 372869 1 AA669097 AA513815 AA026798 AA576526 AA704429 AA704269 AW1 18292 AA578216 N58172 

40 332396 20265? AW579842BE156562BE156690BE156489BE081033AK00^ 

R17370 A1908947 AA382932 R58449 H18732 AA371231 AW9S2899 AA713530 AW892946 R53463 H11063 
AW068542 Z40761 BE176212BE176155W23952 W92188 AW374883AA303497AW954769AA036808 
BE168063 AW382073 AW382085 AL041475 H80748 AI078161 BE463983 AI805213AI761264W94885 
N945Q2 AI623772 AI419532 AI810302 AI634190 AW0Q2516 AW150777 AI352312 AI367474 AW2O4807 

45 AI6755Q2 AJ337Q28 AW1 34715 BE328451 AI123157 AI560020 AI300745 AI608631 A1248873 AA742484 

AW051635 H1B646 AI245045 AA507111 A164O510 AI925594 AA1 15747 AA143035 AA151106 
332781 32044 1 AK001764 BE313896 AA380199 AA380151 AA194996 AW1 18089 AA495871 AW975219 AW085598 

A1378909 AW992310 AW992409 AI911857 AA657643 AI804471 AI242589 AI623968 R09556 A1129100 
A1206500 AA680094 AA677784 AI023178 AI27751 9 AA424742 AI240654 AA232846 AI804Z73 AI382376 

50 AA001729 W90790 BE090656 AW295015 AI674596 A1431734 AM20517 AW769185 A1128355 AI192474 

AI820001 AA001929 AA706925 A1076676 AI4991 19 AI200493 AI695919 AI376217 W69195 W69261 
AW305099 W90320 BE048357 AI658856 AA838534 AA233258 AI753393 AA709227 AI674387 AI872616 
TABLE 3B shows the genomic positioning for those primekeys lacking UniGene IDs and
accession numbers in Table 3. For each predicted exon, we have listed the genomic sequence
source used for prediction. Nucleotide locations of each predicted exon are also listed.

Pkey: Unique number corresponding to an Eos probeset

Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the
publication entitled "The DNA sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495.

Strand: Indicates DNA strand from which exons were predicted.

Nt_position: Indicates nucleotide positions of predicted exons.
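
The Strand and nucleotide-position columns defined above can be read programmatically. The following is a minimal sketch, not part of the original disclosure: the function name `parse_exon` is hypothetical, and the exon length is computed on the assumption that both listed coordinates are inclusive. Because the tables list the higher coordinate first for some Minus-strand exons and the lower coordinate first for others, the sketch orders the two values itself.

```python
def parse_exon(strand, nt_position):
    """Parse one Strand / position pair, e.g. ("Plus", "6548368-6548507").

    Returns a dict with ordered start/end coordinates and an exon length
    that assumes both coordinates are inclusive (an assumption, not stated
    in the table legend).
    """
    if strand not in ("Plus", "Minus"):
        raise ValueError(f"unexpected strand value: {strand!r}")
    # A position is expected to be two integers joined by a single hyphen;
    # anything else (e.g. an OCR-damaged entry) raises ValueError.
    first, second = (int(part) for part in nt_position.split("-"))
    start, end = min(first, second), max(first, second)
    return {"strand": strand, "start": start, "end": end, "length": end - start + 1}


if __name__ == "__main__":
    print(parse_exon("Plus", "6548368-6548507"))
    print(parse_exon("Minus", "216964-216798"))
```

Entries whose coordinates are damaged are rejected rather than guessed at, so a caller can list them for manual checking against the original publication.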



Pkey Ref Strand Nt_position

333611 Dunham, I. et al. Plus 6548368-6548507
333621 Dunham, I. et al. Plus 85974148597560
333814 Dunham, I. et al. Plus 7894165-7894252
333849 Dunham, I. et al. Plus 80183234018472
333949 Dunham, I. et al. Plus 85896344589791
333951 Dunham, I. et al. Plus 8592501-6592637
333955 Dunham, I. et al. Plus 8597414-8597560
334150 Dunham, I. et al. Plus 10529221-10529854
334297 Dunham, I. et al. Plus 13420934-13421058
334443 Dunham, I. et al. Plus 1429898M4299056
334444 Dunham, I. et al. Plus 14306433-14306492
334447 Dunham, I. et al. Plus 14308764-14308824
334570 Dunham, I. et al. Plus 14994868-14994943
334777 Dunham, I. et al. Plus 16259586-16260166
335179 Dunham, I. et al. Plus 216344(3-21634526
335581 Dunham, I. et al. Plus 24976198-24976334
335586 Dunham, I. et al. Plus 24990333-24990497
335809 Dunham, I. et al. Plus 26310772-26310909
335810 Dunham, I. et al. Plus 26314767-26314849
335822 Dunham, I. et al. Plus 26364087-26364196
335824 Dunham, I. et al. Plus 26376860-26376942
335886 Dunham, I. et al. Plus 26934235-26934364
336034 Dunham, I. et al. Plus 29014404-29014590
336441 Dunham, I. et al. Plus 34187606-34187663
337577 Dunham, I. et al. Plus 595377-595678
338260 Dunham, I. et al. Plus 15458919-15459257
332797 Dunham, I. et al. Minus 216964-216798
332798 Dunham, I. et al. Minus 232147-231974
332799 Dunham, I. et al. Minus 232421-232307
332933 Dunham, I. et al. Minus 2035790-2035681
332980 Dunham, I. et al. Minus 5136165-5136019
332984 Dunham, I. et al. Minus 2632606-2632457
333168 Dunham, I. et al. Minus 3729896-3729788
333169 Dunham, I. et al. Minus 3730864-3730767
333452 Dunham, I. et al. Minus 5136165*136019
333456 Dunham, I. et al. Minus 2631933-2631797
333458 Dunham, I. et al. Minus 5143942-5143806
334223 Dunham, I. et al. Minus 12734365-12734269
334749 Dunham, I. et al. Minus 16090686-16090106
334960 Dunham, I. et al. Minus 20160968-20160795
335293 Dunham, I. et al. Minus 22316408-22316275
335550 Dunham, I. et al. Minus 24668714-2466865B
335853 Dunham, I. et al. Minus 26614629-26614506
336624 Dunham, I. et al. Minus 227714-227577
336625 Dunham, I. et al. Minus 229124-229024
336679 Dunham, I. et al. Minus 2035790-2035681
338255 Dunham, I. et al. Minus 15242294-15242231
338561 Dunham, I. et al. Minus 22311966-22311856
338562 Dunham, I. et al. Minus 22312594-22312465
338759 Dunham, I. et al. Minus 26582475-26582199
338763 Dunham, I. et al. Minus 26628148-26628009
338764 Dunham, I. et al. Minus 26641232-26641101
329980 5091594 Minus 1031-1162
329929 6165201 Minus 156410-156553
330020 6671887 Plus 172397-172491
326816 6552456 Plus 198354-198436
326997 5867660 Minus 71389-72147
327098 6682516 Minus 1061684-1062361
330211 6013592 Plus 59158-59215
328492 5868455 Minus 46094-46241
329362 5868837 Minus 65688-68173
TABLE 4: shows a preferred subset of the Accession numbers for genes found in Table 3
which are differentially expressed in prostate tumor tissue compared to normal prostate
tissue.

Pkey: Unique Eos probeset identifier number

ExAccn: Exemplar Accession number, Genbank accession number

UnigeneID: Unigene number

Unigene Title: Unigene gene title

R1: Ratio of tumor to normal body tissue



Pkey ExAccn UnigeneID Unigene Title R1

100819 HG4020-HT4290 Hs2387 Transglutaminase 105
102698 U75272 Hs.1687 progastricsin (pepsinogen C) 10.6
102889 X02544 HsS72 orosomucoid 1 22.6
105370 AA236476 Hs22791 ESTs; Weakly similar to transmembrane pr 10.3
105645 AA262138 Hs.11325 ESTs 14
106094 AA419461 Hs23317 ESTs 10.9
109014 AA156790 HS262036 ESTs 153
109562 F01811 Hs.187931 ESTs; Moderately similar to voltage-gate 103
113021 l£OCXK> Hs.129836 KIAA1028 protein 103
114124 Z38595 Hs.125019 ESTs; Highly similar to KIAA0886 protein 213
122791 AA460158 Hs.129836 KIAA1028 protein 12>4
124352 N21626 Hs.102406 ESTs 102
301042 AI659131 Hs.197733 ESTs 243
302005 AI869666 Hs.123119 ESTs 36.8
302410 NM_004917 Hs218366 EST cluster (not in UniGene) with exon h 263
302881 AA508353 Hs.105314 relaxin 1 (H1) 783
303344 AA255977 Hs250646 ESTs; Highly similar to ubiquitin-conjug 193
303753 AW503733 Hs3414 ESTs 13
310431 A142Q227 Hs.149358 ESTs 72.9
311251 AI655662 Hs.197698 ESTs 413
311596 AI682088 Hs.79375 ESTs 26/
312153 AA759250 Hs.118625 cytochrome b-561 11
312521 AA033609 Hs.239884 ESTs 112
313876 AA861697 Hs.120591 EST cluster (not in UniGene) 13/
314171 AI821895 Hs.193481 ESTs 29/
314907 AI672225 Hs.222886 ESTs 193
315051 AW292425 Hs.163484 EST 153
315052 AAB76910 Hs.134427 ESTs 20
317548 AI654187 Hs.195704 ESTs 142
317889 AW295184 Hs.129142 ESTs; Weakly similar to DEOXYRIBONUCLEAS 133
318428 AI949409 Hs.194591 ESTs 123
318524 AW291511 Hs.159066 ESTs 253
319080 Z45131 Hs.23023 ESTs 163
319763 AA460775 Hs.6295 ESTs 143
320324 AF071202 Hs.139336 ATP-binding cassette; sub-family C (CFTR 562
321441 AW297633 Hs.116498 ESTs 14.7
322303 W07459 Hs.157601 EST cluster (not in UniGene) 22
322782 AA056060 Hs.2Q2577 EST cluster (not in UniGene) 18/
322818 AW043782 Hs.293616 ESTs 10.7
323287 AA639902 Hs.104215 ESTs 24.7
324603 AW016378 Hs282934 ESTs 242
324617 AA508552 Hs.195839 ESTs 54
324658 AI694767 Hs.129179 ESTs 22
324691 AI217963 Hs293341 ESTs; Weakly similar to Proa2(XI) [H.sa 10.6
324696 AA641092 Hs257339 ESTs 102
324718 AI557019 Hs.116467 ESTs 34.4
330211 CH.05_p2gi|6013592 12.6
330430 HG2261-HT2352 Hs321110 Antigen, Prostate Specific, Alt. Splice 133
330706 AA121140 Hs.177576 ESTs; Moderately similar to kynurenine a 145
330762 AA449677 Hs.15251 Human DNA sequence from clone 437M21 on 185
330892 AA149579 Hs.91202 ESTs 153
330949 H01458 Hs.142896 ESTs 103
331099 R36671 Hs.14846 ESTs 11.6
331151 R82331 Hs.268838 ESTs 13
331889 AA431407 Hs.98802 Homo sapiens Chromosome 16 BAC clone CIT 33.6
332247 N58172 ESTs 14.2
332396 AA340504 ESTs; Weakly similar to similar to human 21.2
332533 M99487 Hs.325825 folate hydrolase (prostate-specific memb 38.1
332697 T94885 Hs.75725 carboxypeptidase E 24.3
332797 CH22_FGENES.6_2 30.8
332798 CH22_FGENES.6_5 66.8
332799 CH22_FGENES.6_6 19.8
334223 CH22_FGENES.360_4 20.3
336624 CH22_FGENES.6-3 43.3
336625 CH22_FGENES.6-4 37.9
TABLE 4A shows the accession numbers for those primekeys lacking UniGene IDs for Table
4. For each probeset we have listed the gene cluster number from which the oligonucleotides
were designed. Gene clusters were compiled using sequences derived from Genbank ESTs
and mRNAs. These sequences were clustered based on sequence similarity using Clustering
and Alignment Tools (DoubleTwist, Oakland California). The Genbank accession numbers
for sequences comprising each cluster are listed in the "Accession" column.

Pkey: Unique Eos probeset identifier number

CAT number: Gene cluster number

Accession: Genbank accession numbers

Pkey CAT number Accession

336624 CH22_4071FG_6J_
336625 CH22_4072FG_6_4_
330211 C_5jj2
332797 CH22J3FG_6JJJNKJ>K3i.G
332798 Cffi^_14FGL6_5_UNK_C4G1.G
332799 CH2^15FGL6_6_UNKJMG1 .G
334223 CH22_1507FG_360_4_UNK^BVI
332247 372969 J AA669097 AA513815 AA026798 AA676526 AA704429 AA704269 AW118292 AA579216 N58172
332396 20265 J AW579842 BE156562 BE156690 BE156489 BE081033 AK001559 BE149402 M85387 AW367811
AW367798 R17370 AI908947 AA382932 R58449 H18732 AA371231 AW962899 AA713530 AW892946
R53463 H11063 AW068542 Z40761 BE176212 BE176155 W23952 W92188 AW374883 AA303497
AW954769 AA036808 BE168063 AW382073 AW382085 AL041475 H80748 AI078161 BE463983
AI805213 AI761264 W94885 N94502 AI623772 AI419532 AI810302 AI634190 AW002516 AW150777
AI352312 AI367474 AW204807 AI675502 AI337026 AW134715 BE328451 AI123157 AI560020
AI300745 AI608631 AI248873 AA742484 AW051635 H18646 AJ245045 AA507111 AI640510 AJ925594
AA115747 AA143035 AA151106
TABLE 4B shows the genomic positioning for those primekeys lacking UniGene IDs and
accession numbers in Table 4. For each predicted exon, we have listed the genomic sequence
source used for prediction. Nucleotide locations of each predicted exon are also listed.

Pkey: Unique number corresponding to an Eos probeset

Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The
DNA sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495.

Strand: Indicates DNA strand from which exons were predicted.

Nt_position: Indicates nucleotide positions of predicted exons.



Pkey Ref Strand Nt_position

332797 Dunham, I. et al. Minus 216964-216798
332798 Dunham, I. et al. Minus 232147-231974
332799 Dunham, I. et al. Minus 232421-232307
334223 Dunham, I. et al. Minus 12734365-12734269
336624 Dunham, I. et al. Minus 227714-227577
336625 Dunham, I. et al. Minus 229124-229024
330211 6013592 Plus 59158-59215
TABLE 5: 1170 GENES UP-REGULATED IN PROSTATE CANCER COMPARED TO
NORMAL ADULT TISSUES

Table 5 shows 1170 genes up-regulated in prostate cancer compared to normal adult tissues.
These were selected from 59680 probesets on the Affymetrix/Eos Hu03 GeneChip array such
that the ratio of "average" prostate cancer to "average" normal adult tissues was greater than
or equal to 3.44. The "average" prostate cancer level was set to the 85th percentile amongst 73
prostate cancers. The "average" normal adult tissue level was set to the 85th percentile
amongst 162 non-malignant tissues. In order to remove gene-specific background levels of
non-specific hybridization, the 7.5th percentile value amongst the 162 non-malignant tissues
was subtracted from both the numerator and the denominator before the ratio was evaluated.
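
The selection rule described above can be illustrated with a short sketch. This is an illustration under stated assumptions, not the procedure actually used to build Table 5: the function names are hypothetical, the percentile is computed by linear interpolation between order statistics (the text does not say which percentile convention was used), the background percentile is taken as 7.5 as read from the text, and a probeset whose background-subtracted normal level is zero or negative is simply treated as not evaluable.

```python
def percentile(values, pct):
    """Percentile by linear interpolation between order statistics.

    The interpolation convention is an assumption; the source text does
    not specify one.
    """
    ordered = sorted(values)
    if not ordered:
        raise ValueError("percentile() needs at least one value")
    rank = (len(ordered) - 1) * pct / 100.0
    lower = int(rank)  # rank is non-negative, so int() floors it
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)


def tumor_to_normal_ratio(tumor_levels, normal_levels,
                          average_pct=85.0, background_pct=7.5):
    """Background-subtracted ratio of "average" tumor to "average" normal.

    Returns None when the background-subtracted denominator is not
    positive (a handling choice made for this sketch only).
    """
    background = percentile(normal_levels, background_pct)
    numerator = percentile(tumor_levels, average_pct) - background
    denominator = percentile(normal_levels, average_pct) - background
    if denominator <= 0:
        return None
    return numerator / denominator


def is_up_regulated(tumor_levels, normal_levels, threshold=3.44):
    """Apply the ratio cut-off quoted in the text (>= 3.44)."""
    ratio = tumor_to_normal_ratio(tumor_levels, normal_levels)
    return ratio is not None and ratio >= threshold


if __name__ == "__main__":
    # Toy expression levels for one probeset (made-up numbers).
    tumors = [10, 20, 30, 40, 50]
    normals = [1, 2, 3, 4, 5]
    print(tumor_to_normal_ratio(tumors, normals))
    print(is_up_regulated(tumors, normals))
```

Subtracting the same background value from numerator and denominator means a probeset with a high non-specific signal in every tissue does not produce a misleadingly small ratio; in practice the function would be called once per probeset with the 73 tumor and 162 non-malignant measurements.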



Pkey: Unique Eos probeset identifier number

ExAccn: Exemplar Accession number, Genbank accession number

UnigeneID: Unigene number

Unigene Title: Unigene gene title

R1: Ratio of tumor to normal tissue




Pkey ExAccn UnigeneID Unigene Title R1

446057 AJ420227 Hs.149358 ESTs, Weakly similar to A46010 X-linked 86.42
400302 N48056 Hs.1915 folate hydrolase (prostate-specific memb 68.46
414569 AF109298 Hs.118258 prostate cancer associated protein 1 58.36
417407 AA923278 Hs2909G5 ESTs, Weakly similar to protease [H.sapi 56.16
431579 AW971082 Hs.222886 ESTs, Weakly similar to TRHY_HUMAN TRICH 53.38
409361 NM_005982 Hs.54416 sine oculis homeobox (Drosophila) homolo 4628
409731 AA125985 HSJ6145 thymosin, beta, identified in neuroblast 4524
400298 AA032279 Hs.61635 six transmembrane epithelial antigen of 4348
420154 AI093155 Hs.95420 JM27 protein 41.12
433466 AA508353 Hs.105314 relaxin 1 (H1) 3938
400296 AA305627 Hs.139336 ATP-binding cassette, sub-family C (CFTR 3842
400292 AA250737 Hs.72472 ESTs 38.00
432887 AJ926047 Hs.162859 ESTs 36.48
439176 AI446444 Hs.190394 ESTs, Weakly similar to B28096 line-1 or 3645
430722 AW968543 Hs203270 ESTs, Weakly similar to ALU1_HUMAN ALU S 3320
437052 AAB61697 Hs.120591 ESTs 33JJ2
418396 AI765805 Hs26691 ESTs 32^8
434036 AJ659131 Hs.197733 hypothetical protein MGC2849 3244
407709 AA456135 Hs.23023 ESTs 32.10
426747 AA535210 Hs.171995 kallikrein 3, (prostate specific antigen 3130
407168 R45175 ESTs 31.72
440260 AJ972867 Hs.7130 copine IV 3052
421513 X00949 Hs.105314 relaxin 1 (H1) 30.10
416370 M90470 HS203697 ESTs, Weakly similar to I38022 hypotheti 29.68
407122 H20276 Hs21742 ESTs 2924
400287 S39329 Hs.181350 kallikrein 2, prostatic 2820
432244 AJ669973 Hs200574 ESTs 28.74
451939 U80456 Hs27311 single-minded (Drosophila) homolog 2 28.74
415989 AJ267700 Hs.111128 ESTs 28.34
418961 AW967646 Hs.23023 ESTs 2754
425628 NM_004476 Hs.1915 folate hydrolase (prostate-specific memb 2722
458509 AA654650 Hs.282906 ESTs 2724
448290 AK002107 Hs20843 Homo sapiens cDNA FLJ11245 fis, clone PL 27.16
428336 AA503115 Hs.183752 microseminoprotein, beta- 26.17
450096 AI682088 HS223368 holocarboxylase synthetase (biotin-[prop 25.60
400299 X07730 Hs.171995 kallikrein 3, (prostate specific antigen 24.91
437571 AA760894 Hs.153023 ESTs 24.74
453160 AJ263307 Hs.146228 H2B histone family, member L 24.66
453096 AW294631 Hs.11325 ESTs 24.46
425075 AA506324 Hs.1852 acid phosphatase, prostate 2423
407202 N58172 Hs.109370 ESTs 24.18
424846 AU077324 Hs.1832 neuropeptide Y 2357
453370 AI470523 Hs.162356 ATP-binding cassette, sub-family C (CFTR 23.16
422805 AA4389B9 Hs.121017 H2A histone family, member A 2252
444917 R68651 Hs.144997 ESTs 2226
408826 AF216077 Hs.48376 Homo sapiens clone HB-2 mRNA sequence 22.02
413597 AW302885 Hs.117183 ESTs 21.76
426429 X73114 Hs.169849 myosin-binding protein C, slow-type 2122
435981 H74319 Hs.188620 ESTs 21.12
432966 AA650114 ESTs 2157
418848 AI820961 Hs.183465 ESTs 210)6
405685 20.90
443271 BE568568 Hs.195704 ESTs 19.98
418819 AA228776 Hs.191721 ESTs 1944
420757 X78592 Hs59915 androgen receptor (dihydrotestosterone r 19.72
418994 AA296520 Hs.89546 selectin E (endothelial adhesion molecul 1956
429918 AW873986 Hs.119383 ESTs 19.04
415539 AI733881 Hs.72472 ESTs 1843
450382 AA397658 Hs.60257 Homo sapiens cDNA FLJ13598 fis, clone PL 1834
418829 AA516531 Hs.55999 NK homeobox (Drosophila), family 3, A 1828
429984 AL050102 Hs227209 hypothetical protein FLJ21617 1752
443822 AHJ87412 Hs.143811 ESTs, Weakly similar to 2004399A chromos 1756
431676 AI685464 Hs.292638 gbrtt8Sf04j(1 NCI_CGAP_Pr28 Homo sapiens 17.64
410330 AW023630 Hs.46786 ESTs 1752
432441 AW292425 Hs.163484 ESTs 1741
452792 AB037765 Hs.30652 KIAA1344 protein 1759
445472 AB006631 Hs.12784 Homo sapiens mRNA for KIAA0293 gene, par 17.00
414565 AA502972 Hs.183390 hypothetical protein FLJ13590 1652
430487 D87742 Hs241552 KIAA0268 protein 16.72
431716 D89053 Hs268012 fatty-acid-Coenzyme A ligase, long-chain 1650
419536 AA603305 gb:np12d11.s1 NCI_CGAP_Pr3 Homo sapiens 1650
439677 R82331 Hs.164599 ESTs 16.46
449625 NM_014253 Hs23798 odz (odd Oz/ten-m, Drosophila) homolog 1 1652
408430 S79876 Hs.44926 dipeptidylpeptidase IV (CD26, adenosine 1628
447033 AI357412 Hs.157601 ESTs 16.02
453006 AI362575 Hs.167133 ESTs 15.74
431474 AL133990 Hs.190642 ESTs 15.70
420218 AW958037 Hs.22437 ribosomal protein U 15.64
408000 L11690 Hs520 bullous pemphigoid antigen 1 (230/240kD) 1554
416208 AW291168 Hs.41295 ESTs, Weakly similar to MUC2_HUMAN MUCIN 1548
430226 BE245562 Hs.2551 adrenergic, beta-2-, receptor, surface 1540
415263 AA948033 Hs.130853 ESTs 1556
432437 W07088 HS293685 ESTs 1526
428398 AI249368 Hs.98556 ESTs 1521
429900 AA460421 Hs.30875 ESTs 1450
449156 AF103907 Hs.171353 prostate cancer antigen 3 1459
411096 U80034 Hs58583 mitochondrial intermediate peptidase 1451
435974 U29690 Hs57744 Homo sapiens beta-1 adrenergic receptor 14.76
444484 AK002126 Hs.11260 hypothetical protein FLJ11264 14.76
422728 AW937826 Hs.103262 ESTs, Weakly similar to ZN91_HUMAN ZINC 1450
418601 AA279490 Hs.86368 calmegin 1456
448999 AF179274 Hs22791 transmembrane protein with EGF-like and 1455
445885 AI734009 Hs.127699 KIAA1603 protein 1444
452712 AW838616 gb:RC5-LT0054-14020<H) 13-D01 LT0054 Homo 1422
432189 AA527941 gbmh30c04.s1 NCI_CGAP_Pr3 Homo sapiens 14.12
424565 AW102723 Hs.75295 guanylate cyclase 1, soluble, alpha 3 13.78
429290 AF203032 Hs.198760 neurofilament, heavy polypeptide (200kD) 1357
419264 AA877104 Hs293872 ESTs, Weakly similar to ALUB_HUMAN 1340
416445 AL043004 Hs500678 KIAA0135 protein 1352
407275 AI3841B6 gb:qw34h07.x1 NCI_CGAP_Ut4 Homo sapiens 1324
408369 R38438 Hs.182575 solute carrier family 15 (H+/peptide tra 1321
446720 AI439136 Hs.140546 ESTs 13.06
434988 AI418055 Hs.161160 ESTs 13.02
448172 N75276 Hs.135904 ESTs 1258
416182 NM_004354 Hs.79069 cyclin G2 12.94
420544 AA677577 Hs58732 Homo sapiens Chromosome 16 BAC clone CIT 12.79
445413 AA151342 Hs.12677 CGI-147 protein 1254
452588 AAS89120 Hs.110637 homeobox A10 1252
407819 R42185 Hs274803 ESTs 1250
433444 AW975324 Hs.128816 ESTs 1250
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414140 AA281279 H&23317 hypothetical protein FU14681 954 

435980 AF274571 H&129142 deoxyribonuclease II beta 954 

421246 AW582962 Hs.300961 CG W7 protein 920 

427304 AA761526 H&163B53 ESTs 9.16 

5 442914 AW188551 Hs.99519 hypothetical protein FU14007 9.16 

413627 BE182082 HSJ246973 ESTs 9.14 

439699 AF086534 Hs.1 87561 ESTs, Moderately stmRar to ALU1_HUMAN A 9.10 

437718 AI927288 Hs.196779 ESTs 9.07 

439820 AL36Q204 HsJ283853 Homo sapiens mRNA full length insert cON 9.06 

10 447342 AI199268 Hs.19322 Homo sapiens, Similar to RIKENcDNA 2010 9.05 

446223 BE300091 Hs.119699 hypothetical protein RJ 12969 9.04 

410001 AB041036 Hs£7771 kaliikrein 1 1 9.03 

424012 AW368377 Hs.137569 tumor protein 63 kDa with strong homdog 9.03 

441791 AW372449 Hs. 175982 hypothetical protein RJ21 159 9.Q2 

15 446206 BE622585 Hs.3731 ESTs, Moderately similar to r38022 hypot 9.02 

414269 AA296489 olfactory receptor, family 51, subfamily 8.99 

442081 AA401863 Hs52380 ESTs 8.98 

420092 AAB14043 H&88045 ESTs 8.85 

411630 U42349 Hs.71119 Putative prostate cancer tumor suppresso 8.80 

20 421863 AI952677 Hs. 108 972 Homo sapiens mRNA; cOMA DKFZp434P228 (fr 8.60 

454141 AW138413 Hs, 182356 ATP-fahding cassette, sub-family C (CFTR 8.60 

418278 A1088439 Hs33937 hypotheticai protein 8.78 

428330 L22524 Hs5258 matrix metalloproteinase 7 (matrilysin, 8.76 

432415 T16971 Hs589014 ESTs, Weakly similar to A43932 mucin 2 p 8.75 

25 424906 AI566086 Hs.1 53716 Homo sapiens mRNA for Hmob33 protein, 3* 8.74 

415245 N59650 HS27252 ESTs 8.72 

442409 BE208843 Hs.129544 hypothetical protein MGC15438 8.70

404571 8.66

418033 W68180 H&259855 elongation factor-2 kinase 8.64

456497 AW967956 Hs.123648 ESTs, Weakly similar to AF1084601 ubinu 8.56

405876 9^4

448807 AI571940 Hs.7549 ESTs 852

445372 N36417 Hs.144828 ESTs 8.48 

425171 AW732240 Hs300615 ESTs 8.44 

419968 X04430 Hs.93913 interleukin 6 (interferon, beta 2) 8.36

407385 AA610150 Hs572072 ESTs, Weakly similar to I38022 hypotheti 8.31

433172 AB037841 Hs.102652 hypothetical protein ASH 1 8.30

422631 BE218919 Hs.118793 hypothetical protein FLJ10688 857

412719 AW016610 Hs.129911 ESTs 854 

418849 AW474547 H&53565 Homo sapiens PIG-M mRNA for mannosyltran 852

444922 AI921750 Hs.144871 Homo sapiens cDNA FLJ13752 fis, clone PL 852

427674 NM_003528 H&2178 H2B histone family, member Q 850

432101 AI918950 Hs.11092 EphA3 8.17

416268 H51299 gb:yp07c06.s1 Soares breast 3NbHBst Homo 8.15

45 404915 8.08 

440106 AA864968 Hs.127699 KIAA1603 protein 8.07

442861 AA243837 Hs57787 ESTs 8.06 

452259 AA317439 Hs-28707 signal sequence receptor, gamma (translo 8.06 

443250 AI041530 Hs.132107 ESTs 8.06 

437267 AW511443 HS558110 ESTs 8.04

452891 N75582 H&212875 ESTs, Weakly similar to DYH9_HUMAN CflJ 8.02

422219 AW978073 regulator of mitotic spindle assembly 1 8.00

453049 BB37217 Hs30343 ESTs 8.00

439731 AI953135 Hs.45140 hypothetical protein FLJ14084 7.98

408554 AA836381 Hs.7323 nuclear receptor co-repressor/HDAC3 comp 7.94

421154 AA284333 HS587631 Homo sapiens cDNA FLJ14269 fis, clone PL 7.94

430107 AA465293 Hs.105069 ESTs 7.94 

433404 T32982 Hs.1 02720 ESTs 7.93 

450813 AI739625 Hs503376 ESTs 7.90

416239 AL038450 Hs.48948 ESTs 7.85

448212 AI475858 Qbic87d07JCl NCI_CGAP_CU-1 Homo sapiens 7.82

449532 W74653 Hs571593 ESTs, Moderately similar to A47582 B-cel 7.82

413930 M86153 Hs.75618 RAB11A, member RAS oncogene family 7.80 

458191 AI420611 Hs.127832 ESTs 7.80 

444858 AI199738 Hs508275 ESTs, Weakly similar to ALUA_HUMAN !!!! 7.78

457498 AI732230 Hs.191737 ESTs 7.78

407235 D20569 Hs.169407 SAC2 (suppressor of actin mutations 2, y 7.76

433759 AA860003 Hs.109363 Homo sapiens cDNA: FLJ23603 fis, clone L 7.74

433805 AA706910 Hs.112742 ESTs 7.74 
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448902 


Z45998 


H&22543 


Homo sapiens mRNA; cDNA DKFZp761M912 (f 


5^14 


459055 


N23235 


Hs30567 


ESTs, Weakly similar to B34087 hypotheti 


5.14 




431318 


AA502700 


HS293147 


ESTs, Moderately similar to A46010X-Cn 
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452953 


A1932884 


H&271741 


ESTs, Weakly similar to A46010 XMinked 
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428572 


AK000S84 


Hs.183887 


hypotfiefJcal protein RJ22104 


5.12 
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434401 


AI864131 


Hs.71119 


Putative prostate cancer tumor suppress© 
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AW163Q45 


Hs.79334 


nuclear factor, interleukin 3 regulated 
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Hs.61635 
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H&98280 


potassium intermediate/small conductance 
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407945 


X69208 
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ATPase, Cu++ transporting, alpha poiypep 
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NMJX01851 


H&154850 


collagen, type IX, alpha 1 
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HA59757 


zinc finger protein 281 
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HS57846 


ESTs 
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gb:ny57g01.s1 NC(_CGAPJ J r18 Homo sapiens 
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H&232234 


ESTs 
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418092 
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Hs.1 06604 


ESTs 
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418576 
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H&289104 


Alu-blnding protein with zinc finger dom 
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413328 


Y15723 


Hs.75295 


guanylate cyclase 1 , soluble, alpha 3 
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protein kinase C binding protein 1 
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432729 
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hypothetical protein RJ2Q285 
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Homo sapiens clone Z3-1 placenta expres 
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439662 


H97552 
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ESTs 
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Hs5B3858 


Homo sapiens mRNA full length insert cON 
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AL049176 


Hs.82223 


chord in-fike 


552 




437814 


AI088192 


Hs.1 35474 


ESTs, Weakly similar to DDX9.HUMAN ATP-0 
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426342 


AF093419 


Ha.169378 


multiple PDZ domain protein 
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15 


429782 


NMJJ05754 


Hs.220689 


Ras-GTPase-actrvating protein SH3-domain 


5.02 


429975 


AI167145 


Hs.1 65538 


ESTs 


5.02 


436209 


AW850417 


H&254Q20 


ESTs, Moderately similar to unnamed prot 


5.02 




438571 


AW020775 


H&56Q22 


ESTs 


5.02 
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AM18204 


Hs541493 


natural killer-tumor recognition sequenc 
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Hs.267705 


tubulin-spectfic chaperone e 
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gb:HSC28F061 normalized infant brain cDN 


5.00 




425465 


118964 


Hs.1904 


protein kinase C, iota 


5.00 




430599 


mm 004855 


Hs547118 


phosphatidyiinositot giycan, class B 


5.00 




450961 


AW978813 


Hs55Q867 


rnetaflothkmein IE (functional) 


5.00 


25 


451386 


ABQ29006 


HS26334 


spastic paraplegia 4 (autosomal dominant 


5.00 


420380 


AA640891 


Hs.1 02406 


ESTs 


4.99 




424947 


R77852 


H&239625 


ESTs, Weakly similar to alternatively sp 


4.99 




442653 


BE269247 


Hs.170226 


gb:601 185486F1 N1H.MGC J Homo sapiens cD 


498 




457211 


AW972555 


Hs.32399 


ESTs, Weakly similar to S51 797 vasodilat 


457 


30 


425851 


NM.001490 


Ha.159642 


giucosaminyl (N-acetyl) transferase 1, c 


4J97 


446279 


AA490770 


Hs.182382 


ESTs 


456 






AI752713 


Hs43845 


ESTs 


4J96 




450218 


R0201B 


Hs.168640 


ankylosis, progressive (mouse) bomolog 


456 




412715 


NM_00Q947 


Hs.74519 


primase, polypeptide 2A (58kD) 


454 


35 


448164 


R61680 


H&56904 


ESTs, Moderately similar to Z1 95_HUMAN Z 


4.94 


420121 


AW958271 


Hs.191534 


ESTs, Weakly similar to ALU INHUMAN ALU S 


454 




421689 


N87820 


Hs.1 06826 


KIAA1696 protein 


453 




445808 


AV655234 


Hs298083 


ESTs, Moderately similar to PC4259 ferri 


452 




416533 


BE244053 


Hs.79362 


retinobiastorna-like 2 0)130) 


452 




418049 


AA211467 


Hs.1 90488 


Homo sapiens, Similar to nuclear locaJiz 


452 


40 


436039 


AWQ23323 


Hs.121070 


ESTs 


452 


432653 


N62096 


H&293185 


ESTs, Weakly similar to JC7328 amino act 


451 




420324 


AF1 63474 


HS56744 


prostate androgen-regulated transcript 1 


451 




403047 








451 




4366% 


AA764852 




ESTs 


4.90 


45 


431117 

Wi 1 it 


AF003522 


H&250500 


delta {Drosophila)-Gte 1 


450 
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IB. 1 ff BUM 


RAN bindina orotein 2 
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428804 


AK000713 


Hs.193736 


hypothetical protein FU20706 
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A33nsn 

UdlflU 


A! 0939 30 


Hs.163440 


Homo sapiens cDNA: RJ21000 fis, clone C 


456 


50 




AA225313 


H&222886 


ESTs, Weakly similar to TRHY.HUMAN TRICH 


456 


432615 


AA557191 


Hs55Q28 


ESTs, Weakly similar to 154374 gene NF2 


456 




412652 


AI801777 


Hs.6774 


ESTs 


456 




439473 


AI2Q2703 


Hs.1 52414 


ESTs 


456 




449071 


NM 005672 


H&22960 


breast carcinoma amplified sequence 2 


456 




450654 


AJ245587 


HS25275 


KruppeMype zinc finger protein 
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418866 


T65754 


Hs.100489 


gb:yc11c07.s1 Stratagene lung (937210) H 
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gb:yq30f05.r1 Scares fetal liver spleen 
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WAA1610 protein 
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ESTs 
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KIAA1 157 protein 
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ESTs 
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arachidonate 15-Bpoxygenase, second typ 
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Homo sapiens cDNA: FU21245 fis, done C 
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d^ydrolipoamide branched chain transacy 
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low density lipoprotein receptor-related 
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sporufin 2, extracellular matrix protein 
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Hs.79691 


UM domain protein 
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AL041465 


HS594038 


golgin-67 
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420235 


AA256756 


Hs.31178 


ESTs 
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HS.147924 


prostate cancer associated protein 5 


450 
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Human EST clone 1228B7 mariner transpose 


4 an 
4.07 




419706 


C04649 


Hs.77899 


tropomyosin 1 (alpha) 


4.O0 




414U0D 


A10O949O 


Ue inoooo 


CCTe 
cols 


4.D0 




A1A97A 
4101/0 


luinen 

U4IU0U 


Ue TQilR 

ns. /y 100 


1 nmfaln Ashman ramtlataH 

uv- 1 praiflin, esuugen reguiaieo 


A CA 
4.04 


45 
*rJ 




AAOonoec 


u» ooono 


nomo saptsns cuna. ru^i i a ns, oone v 


A OA 
4.04 


44o4U/ 


AI4/04oU 


Ue 4 MIC f J 

nS.i7U077 


ESTs 
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hypothetical protein 
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ESTs 
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ESTs 
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ESTs 
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K1AA1547 orotein 
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Hs.194591 


ESTs 
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gGoma-arrtpEfied sequance-41 


430 




403764 
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410659 


AI080175 


HSJ68826 


ESTs 


438 




432383 


AK000144 


HS274449 


Homo sapiens cDNA FU20137 fis, clone CO 


438 


60 


451246 


AW189232 


Hs%39140 


cutaneous T-ceD lymphoma tumor antigen 


438 


433234 


AB040926 


Hs.65366 


KIAA1495 protein 


437 




424983 


AI742434 


Hs.169911 


ESTs 
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437812 


A1582291 


Hs.16846 


ESTs, Weakly similar to 04HUD1 debrisoqu 
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AI082883 


Hs.167593 


hypothetical protein RJ13409; KIAA1711 


435 


65 
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BE005346 


Hs.1 16410 


ESTs 


435 
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AI823987 


Hs.182285 


ESTs 
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408897 


N50204 


H&283709 


lipopolysaccharide specific response-7 p 


434 
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AW023424 


Hs.156520 


ESTs 
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general transcription factor IIH, polype 
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KIAA1265 protein 
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433644 AW342Q26 Hs256112 gb*b75d03jfl NCLCQAP.LH2 Homo sapiens 453 

408321 AW405882 H&44205 cortistatin 453

439225 AA192669 Hs.45032 ESTs 452

440348 AW015802 Hs.47023 ESTs 452

446351 AW444551 H&258532 x 001 protein 452

451212 AW902672 Ks287334 ESTs 452

430294 AI538226 Hs.135184 guanine nucleotide binding protein 4 452

435005 U80743 Hs.4316 trinucleotide repeat containing 12 452

448072 AJ459306 Hs24908 ESTs 450

403721 450

451018 AW365599 H&247324 mitochondrial ribosomal protein S14 450

453070 AK001465 H&31575 SEC63, endoplasmic reticulum translocon 4.49

417412 X16896 H&82112 interleukin 1 receptor, type I 4.48

439735 AI635386 Hs.142846 hypothetical protein 4.48

435663 AI023707 Hs.134273 ESTs 4.48

424036 AA770688 H&81946 H2A histone family, member L 4.48

426386 AA748850 Hs.174877 bladder cancer overexpressed protein 4.48

408622 AAQ56060 Hs2G2577 Homo sapiens cDNA FLJ12166 fis, clone MA 4.47

444269 AI590346 Hs.146220 ESTs 4.47 

20 430187 AJ799909 Ha.158989 ESTs 4.46 

427761 AA412205 Hs.140996 ESTs 4.46 

430261 AA305127 Hs237225 hypothetical protein HT023 4.46 

444169 AV648170 Hs58756 ESTs 4.44 

430598 AK001764 H&247112 hypothetical protein RJ10902 4.44 

25 412903 BE007967 Hs.155795 ESTs 4.44 

417048 AI088775 Hs55498 geranylgeranyl diphosphate synthase 1 4.44 

442710 AI015631 Hs23210 ESTs 4.44 

457413 AA743462 Hs.165337 ESTs 4.44 

400303 AA242758 Hs.79136 LIV-1 protein, estrogen regulated 4.42

443268 AI800271 Hs.129445 hypothetical protein FLJ12496 4.42

438209 AL120659 Hs.6111 aryl-hydrocarbon receptor nuclear transl 4.42

431724 AA514535 Hs283704 ESTs 441

412280 AW205116 H&272814 hypothetical protein DKFZp434E1723 4.40

440801 AA906366 Hs.190535 ESTs 4.40

452959 AI933416 Hs.189674 ESTs 4.40

453861 AI026838 K&30120 ESTs, Weakly similar to NUCL_HUMAN NUCLE 4.40

417421 AL138201 H&82120 nuclear receptor subfamily 4, group A, m 4.40

447270 AC002551 Hs531 general transcription factor IIIC, polyp 458

433641 AF080229 gb:Human endogenous retrovirus K clone 1 458

447078 AW885727 Hs501570 ESTs 458

424242 AA337476 hypothetical protein MGC13102 457

408170 AW204516 HS51835 ESTs 456

448757 AI366784 Hs.48820 TATA box binding protein (TBP)-associate 456

420021 AA252848 H&293557 ESTs 456

449694 AI659790 H&253302 ESTs 456

453867 AI929383 Hs.108196 hypothetical protein DKFZp434N185 456

458712 AI347502 Hs.173066 hypothetical protein FLJ20761 456

417251 AW015242 Hs59488 ESTs, Weakly similar to YK54_YEAST HYPOT 455

434423 NM_006769 Hs.3844 LIM domain only 4 455

423427 AL137612 H&285848 KIAA1454 protein 454

415715 F30364 ESTs 453

404561 452

422969 AA782536 Hs.122647 N-myristoyltransferase 2 452

423685 BE350494 Hs.49753 uveal autoantigen with coiled coil domai 452

443977 AL120986 Hs.150627 ESTs, Weakly similar to I38022 hypotheti 452

425071 NM_013989 Hs.154424 deiodinase, iodothyronine, type II 452

431583 AL042613 H&262476 S-adenosylmethionine decarboxylase 1 451

411379 AI816344 Hs.12554 ESTs, Weakly similar to NPL4_HUMAN NUCLE 450

421476 AW953B05 Hs218B7 ESTs 450

425178 H16097 Hs.161027 ESTs 450

439262 AA832333 H&124399 ESTs 450

442818 AK001741 Hs5739 hypothetical protein FLJ10879 450

421977 W94197 Hs.110165 ribosomal protein L26 homolog 429

437114 AA836641 Hs.163085 ESTs 428

420195 N44348 Hs500794 Homo sapiens cDNA FLJ11177 fis, clone PL 428

418330 BE409405 Hs.94722 ESTs 427

419750 AL079741 Hs.183114 Homo sapiens cDNA FLJ14236 fis, clone NT 426

437065 AL036450 Hs.103238 ESTs 426 

455276 BE176479 gb:RC3-HT0585-1 60300-022-609 HT0585 Homo 424 
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416292 AA1 79233 Hs.42390 nasopharyngeal carcinoma susceptibility 424 

423740 Y07701 Hs.132243 aminopeptidase puromycin sensitive 424

442023 AJ187878 Hs.144549 ESTs 454

426764 AA732524 Hs.151464 ESTs, Weakly similar to ALUC_HUMAN 423

454058 AI273419 Hs.135146 hypothetical protein FLJ13984 423

456511 AA282330 Hs.145668 ESTs 422 

448330 AL036449 Hs207163 ESTs 422 
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423065 




Hs 194606 


Homo sanfans. dons M£3C!5406 mRNA. coitid 


3.70 


429340 


N35938 


Hs 199429 


Homo santens mRNA: cDNA DKFZd434M2216 i\ 


3.70 




437777 


AA768098 

rW UUV9U 


Hs.189079 


ESTs 


370 




440351 


AF030933 


Hs.7179 


RAD1 fS nnmhfii hnmnlnn 


3.70 








Hs 134289 


ESTs. Waaktv similar to KIAA1063 omtein 


370 




446965 


BE242873 


Hs.16677 


WD reoaat domain 15 


3.70 




A1CCCWAC 

nlKKSGUD 


na./ooto 


nmtnln h/mctno nhncnhatncA nnn Jtsoont 
piuiciii lyiuanui piiua^iiaiaaoi iiuiiicwjpi 


370 




433852 


AJ 378329 


Hs 126629 


ESTs 


3.70 




•tOOl'K 




Ho. 110640 




3 69 








Ue 1QftfVT7 




3.69 


35 


412628 


/Vol CHVC 


Hs.173902 
no. 1 / vMUt 




3.*69 


«*OI«HO 


AA13971R 


Hs 17RfifLd 


ESTs 


3.69 






AI277652 


HSX4578 


PSTr Wfiaktv similar to 1380?? hvnofhati 


3U58 




414709 


AA704703 


Hs.77031 


Sn? tran<£rinlton {actor 


3.68 




AA7W7 


Q/O 






3.68 


40 


•HOf 10 






3.68 


425217 


AUQ76696 


Hs.155174 


CDCS fflflfl division cvete fi S nomha h 


3.68 






AV6478D8 


K&90424 




3^68 






pc coo ore 
p i. 300030 


He 1K1777 


oiiton/mffo trancfatinn inHiaflnn faMnr 

eURoiyuuG uansiauon uinuuun lacuii 


0 CO 
O.DO 




AOiTiA 
41 I/O* 


AlOlflftOil 
/UO10O&4 


He "UT7AAA 


nomo sapiens cunm rLJfcUOw us, uone r\n 


O.ui 


45 


HCILX. I 


1 1C/AQ 


He 17A/Y17 


tMn HrnnflLI forfait cvnHrnmo 


O.0* 


/4OQ0AA 


AI79flfl7R 


He 991997 


ESTs Weaktv simBar to A475B2 B-call or 


3.68 












q eft 

O.DO 










Hmnn esntono rfona TOf^r^TAflft1C1 mQMA rnnti 
nulJIU SapitNlS CKUlo iwww 1 rVM 19 1 lltniMn oo(|U 


3.66 




«tt/llo 


AWRRfKRO 
AVYOeUOOft 


He 


ESTs 


o cc 
O.00 


50 




A1A/TV>QAft9 
AVVUiS>K££ 


He Q7RAQ 


cots 


O«0D 


^C9QJC 


VQC495 






3.66 




419078 


M93119 


Hs.89584 


inajfinoma-assocsated 1 


3^6 






AinAZROA 


no. IJO0O3 


col S 


365 

O.VKJ 




427144 


X95097 


Hs.2126 


ua^Aflriivfl fniQcfinpI nAnnnfl rfi^flntnr 2 * 


3,65 


55 


447500 AI381900 Hs.159212 ESTs 3.65

453127 AI696671 Hs.294110 ESTs 3.65

423396 AJ382555 Hs.127950 bromodomain-containing 1 3.65

419346 AI830417 potybiomol 3.64

441540 C01367 Hs.127128 ESTs 3.64

446501 AJ302616 Hs.150819 ESTs 3.64

459527 AW977556 Hs.291735 ESTs, Weakly similar to 178885 serine/th 3.63

446320 AF126245 Hs.14791 acyl-Coenzyme A dehydrogenase family, me 3.63

435706 W31254 Hs.7045 GL004 protein 3.63

400110 3.62

410313 R10305 Hs.185683 ESTs 3.62

414713 BE465243 Hs.12664 ESTs 3.62

436279 AW900372 Hs.180793 ESTs, Weakly similar to S65657 alpha-1C- 3.62

439818 AL360137 Hs.18934 Homo sapiens mRNA full length insert cDN 3.62

451797 AW663858 HsX6120 small inducible cytokine subfamily E, me 3.62

451294 AI457338 Hs.29894 ESTs 3.62
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434194 


AF1 19847 


He 91*3040 
no.fcOjoHv 


Hnrrm cfln'uanQ PRTHR^A mPKJA nartiol rata 


3.62 




404939 








3.62 




408101 


A W9 68504 


Ue 193073 


vLA^tl dkUUU plUlOUl rJliouXJ / 


3.62 




•KM OHO 


AA700870 


ru». IHOUH 


ESTs 


3.61 


5 




no 1 u/o 


He A71Q1 


CO 19 


3.61 




4279TB 




He 49598 


ESTs 

C9I9 


3.61 




433495 


AW373784 


Hs.71 




3.60 




4rwiq7 








3.60 

O.Uv 


,10 


4041 fifi 








3.60 


400571 




He 1A75fiS 
no. 10/otw 


CO 18 


3.60 




410CR1 

AiOQOA 
4IC9Z4 


RPMA499 


He RQQ4 

U c 7COCQ 


Hnrnn eaninne rOMA* FI .199044 fie rinna H 
H9A hkfnna famHu momhor V 

ntM lustone ramjiy, lusmDer 1 


3.60 

o en 
o.ou 




434228 


Z42047 


HsJ>fi3978 

njitwaio 


nullIU oCLjJJOJIo rnvt/ »J 1 IliniW, vUliipiQlD UUo 


3.60 




436797 


AA731491 




hvnofhftfenl nmtatn MG^14ft7Q 


3.60 


4371 A9 
HO/ lOt 


, MVVUUODUD 


He RAAA 
nS-OHOH 


uiyruiu normono recepior wacuvaurig pr 


o en 
O.DU 




437444 


H4fi00fl 
nfouuo 




CCTe 
CO 1 9 


3.60 












359 




Niftier 
44010/ 


RP97fifl0fl 


Me 141740 


Hnmn carJone rHMA' H I99CA9 fie Nnna U 

numo Sapiens cutvv ixwajcoog lis, uuns n 


■3 CQ 




Ho/ 00/ 


A ICQ, 1009 


He 199491 


Human HK1A eAftiianna (mm einna ODi-^fiT HI 
nuilitiil Uf in octjUoilUo HQITl UUT19 nr 1* 10/ J ■ 1 


O CQ 
OJOO 


4EJ14/ 


A AOG7007 

SVwO/9£/ 


He 141740 
nS. 1 0 I / HU 


Hnmn cantanc rTtMA- CI lOOKRO fie rinna H 
numO 5apiSnS CUIM/v rLJtcOOc ITS, QDflo n 


OC7 
J.9f 




452226 


AAn94AQA 


He 90R009 


ESTs 


o cc 

cjjo 




AAV77R. 
4407/0 


Andy 1004 


He 90/7*59 


matrix rnstaibprotelnass 26 


r> cc 
0.OD 




452501 


MDUO/ #91 


He 9071ft 








AOQGA7 


AAooUUOU 


nS.1 14044 


col S 


0.00 


42Z443 


MM rw/7JV7 

NNL014707 


KS- 1 10700 


Wstona dQaostytass 76 


OJ00 




447966 


AA340o0o 


nS.lOooo7 


to 1 s, weawy similar to nomoiog 01 rat z 


cc 
A55 




420892 


AW87507O 


nS.1725o9 


nuclear phosphoprotein similar to S. car 


0.00 






/UJU04044 


He 9QQft90 


kLHrul&aU DUX \j 1 


q cc 


^0 


HlOHiffl 


Vinson 


Ue QEflOO 

nS-ooUyt 


thyroid hormoitB receptor intBmctor 1 1 


q ca 
0.04 


428949 


AA441100 


nS.lU4/44 


nypouiencai protein ui\r£p4o4juo 1 1 


q ca 

0.04 




444929 


AlfiflCQ/1 

A10O0041 




ESTs 


0.04 




40IMM 




He flOSft 


glioblastorns overaxpressed 


Om>H 




HtHOOa 


RA7A99 


He 9fi714 
nMO/ IH 


KIAA1A31 nmfain 
fwv\iooi piuioin 


qc4 

0^?H 


JJ 


HOOUUt 


MrU40/0U 


Ue97Q0nft 


cycJinTI 


0.00 


VfJCvlQC 
HOOHCO 


nlocOO 


Ue 9141fi 


CCTe 
CO IS 


q 

OJO 




41CC01 

41O0Z1 


AiAvioeno 
AJOHwlCi 


Ue 101 1flQ 


CCTe 
Cola 


q CO 




410374 






RALBP1 associated Eps domain containing 


OOO 




400/80 








OC9 


40 


409770 


AVV4byooo 




nM ILUC.RDnrualLA.19JU II rl K)tU kitR^ C 


q co 


425305 


AAQCQftOC 


Ue 4CEC70 


nurnan aorta zooui rririNA sequence 


q co 




428939 


AVVZOOOOU 




ESTs 


Q CO 








Ue AAROa 

nS.44090 


ESTs 


q CO 




443703 


AVOHOl// 




ESTs 


q co 




457840 


Al QfiniKO 




no mo sapiens i nipannB moDi proxBin ps 


q R9 


AtYiAAA 

4UZ444 








q R9 




409643 


AW450ooo 


HS257359 


CoTS 


4 C4 
OOI 








He ftTQIfl 


eriannetna ivMnrtnhnenhafA Haamtnoea /ienfrt 
aUwiuSiJIQ IHUilup ll Uopi la u) UcalTuriaoo 


^R1 
0-0 I 




432745 


AIQOiOOC 

AUwflUcb 


Ue OCOC/T7 


go Jit/ oiuo Jo NuLvAaAr_rro homo sapiens 


0 CI 




A4A*)V> 

414Z&: 


AL 1001/0 


He 070 


so id Hoi oenyuroganase 


q ci 


430061 


AB037oi7 




KlAAioSo protein 


O Ci 




421491 


H99999 


HS-42735 


ESTs 


4 Cft 

350 




4ZZ394 


AA90AA77 




Sm protein F 


q cn 




434565 T52172 ESTs 3.50

438379 N23018 Hs/171391 C-terminal binding protein 2 3.50

439741 BE379646 Hs.6904 Homo sapiens mRNA full length insert cDN 3.50

447311 R37010 Hs.33417 Homo sapiens cDNA: FLJ22806 fis, clone K 3.50

447805 AW627932 Hs.19614 gemin4 3.50

454265 H03556 Hs.300949 ESTs, Weakly similar to thyroid hormone 3.50

418838 AW385224 H&35198 ectonucleotide pyrophosphatase/phosphodi 3.50

448804 AW512213 Hs.42500 ADP-ribosylation factor-like 5 3.50

409617 BE003760 Hs.55209 Homo sapiens mRNA; cDNA DKFZp434K0514 (f 3.49

434075 AW003416 Hs.160604 ESTs 3.49

444190 AI878918 Hs.10526 cysteine and glycine-rich protein 2 3.49

435017 AA336522 Hs.12854 angiotensin II, type 1 receptor-associat 3.48

423445 NM_014324 Hs.128749 alpha-methylacyl-CoA racemase 3.48

420271 AI954365 Hs.42892 ESTs 3.48

443684 AI681307 Hs.166674 ESTs 3.48

444168 AW379879 gb:RC1-HT0256-081199-011-f01 HT0256 Homo 346

446074 AA079799 H&29263 hypothetical protein FLJ11696 3.48
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45258? 


AL1 37407 


Hs£9911 


Homo sapiens mRNA; cDNA DKrZp434M232 [fr 


3.48 


431542 H63010 Hs^740 ESTs 3.48

432697 AW975050 Hs.293892 ESTs, Weakly similar to ALU4_HUMAN ALU S 3.48

435572 AW975339 Hs239828 ESTs, Weakly similar to GAG2_HUMAN retro 3.47

407192 AA609200 gb:ail2e02.s1 Soares_testis_NHT Homo sap 3.47

413435 X51405 Hs.75360 carboxypeptidase E 3.46


447210 


AFQ35269 


Hs.17752 


pnospnatiaylsenne-speciiic pnospnoitpas 


« AO 


HHfoOO 




nihwOOfr 






425312 AA354940 Hs.145958 ESTs 3.46

442007 AA301116 Hs.142838 nucleolar phosphoprotein Nopp34 3.46

417455 AW007066 Hs.18949 ESTs, Weakly similar to CA2B_HUMAN COLLA 3.45

426931 NM_003416 Hs.2076 zinc finger protein 7 (KOX 4, clone HF.1 3.45

408739 W01556 H&238797 ESTs, Moderately similar to I38022 hypot 3.45

436024 AI800041 Hs.190555 ESTs 3.45

408418 AW963897 Hs.44743 KIAA1435 protein 3.45

409151 AA306105 Hs.50785 SEC22, vesicle trafficking protein (S. c 3.44

418626 AW299508 Hs.135230 ESTs 3.44

420560 AW207748 H&59115 ESTs 3.44

420686 AI950339 Hs.40782 ESTs 3.44

428670 AA436831 HS.3B049 ESTs 3.44

436754 AI061288 Hs.133437 ESTs 3.44

437960 AI669586 Hs.222194 ESTs 3.44

452300 AW628045 Hs.28896 Homo sapiens mRNA full length insert cDN 3.44

421887 AW161450 Hs.109201 CGW6 protein 3.44
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TABLE 5A shows the accession numbers for those primekeys lacking a unigenelD in Tables 
5, 6, and 7. For each probeset we have listed the gene cluster number from which the 
oligonucleotides were designed. Gene clusters were compiled using sequences derived from
Genbank ESTs and mRNAs. These sequences were clustered based on sequence similarity
using Clustering and Alignment Tools (DoubleTwist, Oakland, California). The Genbank
accession numbers for sequences comprising each cluster are listed in the "Accession" 
column. 



Pkey: Unique Eos probeset identifier number
CAT number: Gene cluster number
Accession: Genbank accession numbers
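
The three columns defined above behave like a lookup structure: each probeset key points to one gene cluster, and each cluster to the Genbank accessions it was assembled from. The following is a minimal Python sketch of that mapping, not the tooling used to build the table. The one-row-per-line layout, the `ClusterEntry` type, and both helper functions are assumptions made for illustration, and the CAT numbers in the sample rows are an assumed reading of the scanned values.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class ClusterEntry:
    """One Table 5A row: probeset key, gene cluster number, member accessions."""
    pkey: int
    cat_number: str
    accessions: Tuple[str, ...]


def parse_cluster_row(row: str) -> ClusterEntry:
    """Split a whitespace-separated row into Pkey, CAT number and accessions."""
    fields = row.split()
    if len(fields) < 3:
        raise ValueError(f"expected Pkey, CAT number and accessions: {row!r}")
    return ClusterEntry(int(fields[0]), fields[1], tuple(fields[2:]))


def index_by_accession(entries: List[ClusterEntry]) -> Dict[str, List[int]]:
    """Build the reverse lookup: accession number -> probeset keys."""
    index: Dict[str, List[int]] = {}
    for entry in entries:
        for accession in entry.accessions:
            index.setdefault(accession, []).append(entry.pkey)
    return index


# Sample rows; the CAT numbers are an assumed reading of the scanned table.
ROWS = [
    "455309 1278153_1 AW894017 AW893956 AW894032",
    "453773 980699_1 AL133761 AL133767",
]
ENTRIES = [parse_cluster_row(row) for row in ROWS]
ACCESSION_INDEX = index_by_accession(ENTRIES)
print(ACCESSION_INDEX["AL133767"])
```

Keeping the accessions as a tuple preserves the order in which the table lists them, which a set would lose.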



Pkey

407596 
408432 
409752 
409770 
411440 
411479 

411624 
412991 
414269 
415123 
415715 
416288 
416289 
417730 
418636 
419346 
419536 
420111 
422219 
424179 
424242 
428002 
429163 
432169 
432340 



433586 
433641 



433687 
433891 
434415 
434565 
434804 
437113 
444168 
448212 
448310 
451746 



CAT number 

1003489J 

1058667.1 

115301J 

1154048 J 

124577.1 

1247077.1 

1252166.1 

134248 1 

143133J 

1523390.1 

1548818.1 

1585983.1 

1586037.1 

1695795.1 

177402.1 

184129J 

185688.1 

190755.1 

213547.1 

236389.1 

237181.1 

2856Q2J 

300543.1 

342819.1 

345248.1 

345469.1 

356839.1 

370470.1 

37186.1 



373061.1 

376239.1 

385931.1 

38898 1 

393481.1 

433234.1 

593829.1 

755099.1 

757918J 



Accession 

R86913 R86901 H25352 R01370 H43764 AW044451 W21298 
AW195262 R27868 AW811262 

AW963990 AA078196 AW749482 AA077468 BE151571 AA376917 
AW499536 AW499553 AW502138 AW499537 AW5Q2136 AW501743 
AW749402 AW749403 Z45743 R80376 AA093358 

AWB48047 AWB48202 AW848631 AW848142 AW848702 AW848121 AW848632 AW848140 AW848571 

AW848009 AWB48067 AW848069 AW848905 AW848214 

BE145964 BE146286 AW854564 

AW949013 AA126111

AA298489 AA137165

D60925 D60828 D80767

F30364 F36559 T15435

H51299 H44619 H46391 R86024 H51892 T72744

W26333 R05358 H44682

Z44761 R25801 R11926 R35604

AW749855 AA225995 AW750208 AW750206

AI830417 AA236612

AA603305 AA244095 AA244183

AA255652 AA280911 AW967920 AA262684

AW978073 AW978072 AA807550 AA306587

F30712 F35665 AW263888 AI904014 AI904018 AA336927 AA336502

AA337476 AW966227 AA450376 AW960222 AA381051

AA418703 AA418711 BE071915 BE071920 BE071912

AA884766 AW974271 AA592975 AA447312

AA527941 AI810608 AI620190 AA635266

AA534222 AA632632 T81234

AA534489 AW970240 AW970323

AA650114 AW974148 AA572946

TB5301 AW517087 AA601C54 BE073959 

AF080229 AF080231 AF080230 AF080232 AF080233 AF080234 BE550633 AI638743 AW614951 BE467547 
AI680833 AI633818 N29986 U87592 U87593 U87590 U87591 S46404 U87587 AA463992 AW2068Q2 AI970376 
AI583718 AI672574 N25695 AW665466 AI818326 AA126128 AI480345 AW013827 AA248638 AI214968 
AA204735 AA207155 AA206262 AA204833 AW003247 AW496808 AI08O48O AI631703 AI651023 AI867418 
AW818140 AA502500 AI206199 AI671282 AI352545 BE5O1O30 AI652535 BE465762 AA206331 AW451866 
AA471088 AA206342 AA204834 AA2061 00 AW021661 AA332922 N66048 AA703396 H92278 AW139734 
H92683 U87589 U87595 H69001 U87594 BE466420 AI624817 BE466611 AI206344 AA574397 AA348354 
AI493192 

AA743991 AA604852 AW272737 
AA613792 AW182329 T05304 AWB58385 
BE177494 AW276909 AA632849 
T52172 AF147324 T52248
AA849530 AA659316 H64973
AA744693 AW750059
AW379879 AI126285 H12014
AI475858 AW969013
AI480316 AW847535
M86178 AI813822 DS6993 
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452560 922216J BE077084 AW139963AWB63127AWB06209 AW806204AW8062Q5AW806206 AWB06211 AW806212 

AW806207 AWB06208 AW806210 AI907497 - 
452712 928309J AW838616 AWB38660BE144343AI914520 AW888910BE1 84854 BE184784 

453773 980699 1 AL133761 AL133767 

5 455276 127254U BE176479 BE176678 BE176357 BE176550 AWB86079 BE176676 BE176615 BE176555 BE178489 BE176610 

BE176362 

455309 1278153J AW894017 AW893956 AW894032 
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TABLE 5B shows the genomic positioning for those primekeys lacking unigene ID's and 
accession numbers in Tables 5, 6, and 7. For each predicted exon, we have listed the 
genomic sequence source used for prediction. Nucleotide locations of each predicted exon 
are also listed. 



Pkey: Unique number corresponding to an Eos probeset

Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the
publication entitled "The DNA sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495.

Strand: Indicates DNA strand from which exons were predicted

Nt_position: Indicates nucleotide positions of predicted exons.
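
Each Nt_position entry is a comma-separated list of start-end pairs, one pair per predicted exon. A short Python sketch of how such an entry could be read follows. The function names are hypothetical, and treating both coordinates as inclusive is an assumption, since the table does not state the convention.

```python
from typing import List, Tuple


def parse_nt_position(text: str) -> List[Tuple[int, int]]:
    """Turn '90044-90184,91111-91345' into [(90044, 90184), (91111, 91345)]."""
    exons = []
    for part in text.split(","):
        start_text, end_text = part.strip().split("-")
        start, end = int(start_text), int(end_text)
        if start > end:
            raise ValueError(f"exon start after end: {part!r}")
        exons.append((start, end))
    return exons


def total_exon_length(exons: List[Tuple[int, int]]) -> int:
    """Sum exon lengths, assuming both coordinates are inclusive."""
    return sum(end - start + 1 for start, end in exons)


# Nt_position value listed for Pkey 401045 (Ref 8117619, Plus strand).
EXONS = parse_nt_position("90044-90184,91111-91345")
print(EXONS, total_exon_length(EXONS))
```

Entries whose coordinates were damaged in scanning will raise a `ValueError` here rather than yield a silently wrong interval.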



Pkey    Ref      Strand  Nt_position
401045  8117619  Plus    90044-90184,91111-91345
401424  8176894  Plus    24223-24428
401451  6634068  Minus   119926-121272
401714  6715702  Plus    96484-96661
401747  9789672  Minus   118596-118816,119119-119244,118609-11976^
                         131258,131866-131932,132451-132575,133580-134011
401785  7249190  Minus   165776-165996,166189-166314,166408-166569,167112-167268,167387-167469,168634-1^
401819  7467933  Minus   26217-28486
402408  9796239  Minus   110326-110491
402444  9796614  Plus    28391-28517
402791  6137008  Minus   5103*51207
403047  3540153  Minus   59793-59968
403137  9211494  Minus   92349-92572,92858-93084,93579-93712,93949-94072,94591-94748,95214-95337
403721  7528046  Minus   156647-157366
403764  7717105  Minus   118692-118853
403797  8099896  Minus   123065-125008
404165  9926489  Minus   69025-69128
404210  5006246  Plus    169926-170121
404253  9367202  Minus   55675-56055
404561  9795980  Minus   69039-70100
404571  7249169  Minus   112450-112648
404721  9856648  Minus   173763-174294
404915  7341766  Minus   100915-101087
404939  6862697  Plus    175318-175476
405403  6850244  Minus   37491-37670,40951-41031
405685  4508129  Minus   37956-38097
405718  9795467  Plus    113080-113266
405793  1405887  Minus   89197-89453
405876  6758747  Plus    39694-40031
405917  7712162  Minus   106829-107213
406414  9256407  Plus    49593-49850
406554  7711566  Plus    106956-107121
TABLE 6: 286 GENES ENCODING EXTRACELLULAR OR CELL SURFACE
PROTEINS UP-REGULATED IN PROSTATE CANCER COMPARED TO
NORMAL ADULT TISSUES

Table 6 shows 286 genes up-regulated in prostate cancer compared to normal adult tissues
that are likely to encode extracellular or cell-surface proteins. These were selected as for Table 5,
with the additional requirement that the predicted protein contained a structural domain that is
indicative of extracellular localization (e.g. egf, 7tm domains).



Pkey: Unique Eos probeset identifier number
ExAccn: Exemplar Accession number, Genbank accession number
UnigeneID: Unigene number
Unigene Title: Unigene gene title
R1: Ratio of tumor to normal tissue

Pkey ExAccn UnigeneID Unigene Title R1


409361 NM_005982 Hs.54416 sine oculis homeobox (Drosophila) homolo 48.28
409731 AA125985 Hs.56145 thymosin, beta, identified in neuroblast 45.24
400298 AA032279 Hs.61635 six transmembrane epithelial antigen of 43.43


420154 


AI093155 


Hs55420 
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pnc^nafiflyiinosfuN giycan, ctass n 


q od 
0X0 


406414 








0.00 




412494 


AL1 33900 


Hs.792 


AUr-nDosyiaiion factor oomain proiem 1 


q ra 

0.04 




418329 


AW247430 


H&84152 


(ystainiortina^ta-syrunase 


q pq 
0.00 




424850 


AA151057 


HS. 153498 


chrornosorna 18 opan reading frame 1 


q 00 




427585 


D31152 


Ks.1 79729 


mOmmui (.m V nUVii 4 /CnkniU jl_.Ii.ilK 

collagen, type x, alpna 1 (ocnrrud metapn 


q 09 


423052 


M28214 


HS. 123072 


RAB3B, member HAS oncogene family 


q 00 




416111 


AA033813 


H&79018 


chromatin assembly tactor 1 , subunit A ( 


0x2 




419423 


nno JOB 


H&90315 


lflAA/W17 nrntnln 

MAA0007 protein 


q on 






AA455889 


Hs. 167279 


rT vcsinger*cvinairung naoo enectui pro 


o.ou 


55 


431499 


NMJJ01514 


HS258561 


general transcription factor IIB 


3.80 


444078 


BE246919 


Hs.10290 


U5 snRNP-specific 40 kDa protein (hPrp8- 


a78 




430291 


AV660345 


H&238126 


CGI-49 protein 


3.76 




431637 


AI879330 


HS265S60 


hypothetical protein RJ10563 


3,74 




440411 


N30256 


Hs.151093 


hypothetical protein DKFZp434Q1415 


3,74 


60 


405917 








3.74 


451230 


BE546208 


H&26090 


hypothetical protein FU20272 


173 




429597 


NMJJ03816 


HS5442 


a cfisintegnn and rnetaDoproteinase doma 


3.73 




415075 


127479 


Hs,77889 


Friedreich ataxia region gene X123 


3.72 




440351 


AF030933 


Hs.7179 


RAD1 (S.pombe) homolog 


3.70 


65 


443603 


BE502601 


H&134289 


ESTs, Weakly similar to KIAA1063 protein 


3.70 


446965 


BE242873 


Hs.16677 


WD repeat domain 15 


3.70 




412350 


AJ659306 


Hs.73826 


protein tyrosine phosphatase, non-recept 


3,70 




433852 


Ai 378329 


Hs.126629 


ESTs 


3.70 




447397 


BE247676 


Hs.18442 


E-1 enzyme 


3.68 




405718 








3.68 
425217


AU07669B 


Hs.155174 


CDC5 (ceQ division cycle 5, S. pombe, h 


3.66 


421734 


AI318624 


Hs. 107444 


Homo sapiens cOMA FU20562 lis, done KA 


3.67 


427221 


L15409 


Hs.1 74007 


von HippeKJndau syndrome 


357 


402408 








3.66 


452946 


X95425 


Hs.31092 


EphAS 


3.66 


419078 


M93119 


Hs.89584 


insuBnoma-associated 1 


3.66 


427144 


X95097 


H&2126 


vasoactive intestinal peptide receptor 2 


3.65 


423396 


AI382555 


Hs.127950 


bromodomafn-contalning 1 


3.65 


446320 


AF126245 


Hs.14791 


acyl-Coenzyme A dehydrogenase family, me 


3.63 
3.62 


404939 
403137 








3.60 


437162 


AW005505 


Hs.5464 


thyroid hormone receptor coactivating pr 


3.60 


404210 








339 


443775 


AF291664 

FW taW » 


H&204732 


matrix metattoproteirtase 26 


356 


4525Q1 


AB037791 


Hs£9716 


hypothetical protein FIJI 0980 


3.56 


422443 


NM 014707 


Hs.1 16753 


histone deacetyiase 7B 


355 


420230 


AU034344 


Hsl284186 


forkheadboxCI 


355 


416428 


Y12490 


H&85092 


thyroid hormone receptor infractor 11 


354 


433002 


AF048730 


H&279906 


cyclinTI 


353 


405793 








352 


457940 


AL360159 


H&306517 


Homo sapiens TRIpartite motif protein ps 


352 


402444 








352 


418250 


U29926 


Hi83918 


adenosine monophosphate deaminase (isofo 


351 


414222 


AL135173 


H3.878 


sorbitol dehydrogenase 


351 


422384 


AA224077 


Hs.42438 


Sm protein F 


350 


447805 


AW627932 


Hs.18614 


gemln4 


350 


454265 


H03558 


Hs.300949 


ESTs, Weakly sirdar to thyroid hormone 


350 


423445 


NM.014324 


Ks.128749 


aJpra-methytacyJ-CoA racemase 


3.48 


413435 


X51405 


Hs.75360 


carboxypepfidase E 


3.46 


447210 


AF035269 


Hs.17752 


phosphatidyteerine-specifc phospholipas 


3.46 


426931 


NM_003416 


Hs.2076 


zinc finger protein 7 (KOX 4, clone HF.1 


s.45 


408418 


AW963897 


Hs.44743 


KIAA1435 protein 


3.45 


421887 


AW161450 


Hs.109201 


CGI-86 protein 


2M 
TABLE 7: 42 GENES ENCODING SMALL MOLECULE TARGETS UP-REGULATED IN
PROSTATE CANCER COMPARED TO NORMAL ADULT TISSUES

Table 7 shows 42 genes up-regulated in prostate cancer compared to normal adult tissues that
are likely to be small molecule targets. These were selected as for Table 5, with the additional
requirement that the predicted protein contained a structural domain that is indicative of a
druggable structure (e.g. protease, kinase, phosphatase, receptor). The functional domain is
indicated for each gene.
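
The domain-based selection described for Table 7 can be illustrated with a small filter over the PSDomain column. This is only a sketch under stated assumptions: the keyword list `DRUGGABLE_KEYWORDS` and the function names `split_domains` and `is_candidate_target` are hypothetical and do not come from the disclosure, and the matching rule (a case-insensitive substring match on each comma-separated domain name) is a simplification of whatever criteria were actually applied.

```python
# Hypothetical keyword list. The text names protease, kinase, phosphatase and
# receptor as examples of druggable classes; the exact strings are assumptions.
DRUGGABLE_KEYWORDS = ("trypsin", "peptidase", "pkinase", "phosphatase", "7tm", "hormone_rec")


def split_domains(ps_domain):
    """Split a comma-separated PSDomain field into individual domain names."""
    return [name.strip() for name in ps_domain.split(",") if name.strip()]


def is_candidate_target(ps_domain, keywords=DRUGGABLE_KEYWORDS):
    """Return True if any domain name contains one of the keywords."""
    domains = [name.lower() for name in split_domains(ps_domain)]
    return any(keyword.lower() in name for name in domains for keyword in keywords)


if __name__ == "__main__":
    # Domain strings as they would appear in a PSDomain column.
    for field in ("pkinase,DAG_PE-bind,PH", "Peptidase_M10", "SPRY"):
        print(field, is_candidate_target(field))
```

A keyword list keeps the rule easy to audit; a curated list of exact domain identifiers would be stricter but would need to be maintained separately.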

Pkey: Unique Eos probeset identifier number
ExAccn: Exemplar Accession number, Genbank accession number
UnigeneID: Unigene number
Unigene Title: Unigene gene title
PSDomain: Protein Structural Domain
R1: Ratio of tumor vs. normal tissue

Pkey ExAccn UnigeneID Unigene Title PSDomain R1

426747 AA535210 Hs.171995 kallikrein 3, (prostate specific antigen trypsin 31.30
400299 X07730 Hs.171995 kallikrein 3, (prostate specific antigen trypsin 24.91
420757 X78592 Hs39915 androgen receptor (dihydrotestosterone r Androgen_recep,hormone_rec,zf-C4 19.72
408430 S79876 Hs.44926 dipeptidylpeptidase IV (CD26, adenosine DPPIV_N_term,Peptidase_S9 16.28
430226 BE245562 Hs.2551 adrenergic, beta-2-, receptor, surface 7tm_1 15.40
411096 U80034 Hs.68583 mitochondrial intermediate peptidase Peptidase_M3 14.31
440286 U29589 Hs.7138 cholinergic receptor, muscarinic 3 7tm_1 12.04
420381 D50640 Hs337616 phosphodiesterase 3B, cGMP-inhibited PDEase 11.10
407021 U52077 gb:Human mariner1 transposase gene, comp SET,Transposase_1 11JQ2
401424 arginase 9.58
410001 AB041036 Hs37771 kallikrein 11 trypsin 9.03
428330 L22524 Hs.2256 matrix metalloproteinase 7 (matrilysin, Peptidase_M10 8.76
424099 AF071202 Hs.139336 ATP-binding cassette, sub-family C (CFTR ABC_tran,ABC_membrane 7.64
419991 AJ000098 Hs.94210 eyes absent (Drosophila) homolog 1 Hydrolase 7.20
431992 NM_002742 Hs.2891 protein kinase C, mu pkinase,DAG_PE-bind,PH 6.49
447359 NM_012093 Hs.18268 adenylate kinase 5 adenylatekinase 6.00
400301 X03835 Hs.1657 estrogen receptor 1 Oest_recep,zf-C4,hormone_rec 5.76
421685 AF189723 Hs.106778 ATPase, Ca++ transporting, type 2C, memb E1-E2_ATPase,Hydrolase 5.57
444042 NM_004915 Hs.10237 ATP-binding cassette, sub-family G (WHIT ABC_tran 5.31
447752 M73700 Hs.105938 lactotransferrin transferrin,7tm_1 5.29
407945 X69208 Hs.606 ATPase, Cu++ transporting, alpha polypep E1-E2_ATPase,Hydrolase,HMA 5.08
403047 trypsin 4.91
427617 D42063 Hs.199179 RAN binding protein 2 Ran_BP1,zf-RanBP,TPR,pro_isomerase 4.88
422083 NM_001141 Hs.111256 arachidonate lipoxygenase, second typ lipoxygenase,PLAT 4.32
449535 W15267 Hs23672 low density lipoprotein receptor-related ldl_recept_b,ldl_recept_a,EGF 4.32
425071 NM_013989 Hs.154424 deiodinase, iodothyronine, type II T4_deiodinase 4.32
423740 Y07701 Hs293007 aminopeptidase puromycin sensitive Peptidase_M1 4.24
424701 NM_005923 Hs.151988 mitogen-activated protein kinase kinase pkinase 4.21
424085 NM_002914 Hs.139226 replication factor C (activator 1) 2 (40 AAA,Viral_helicase1 4.20
417531 NM_003157 Hs.1087 serine/threonine kinase 2 pkinase 4.12
428695 AI355647 Hs.189999 purinergic receptor (family A group 5) 7tm_1 331
410011 AB020641 Hs.57856 PFTAIRE protein kinase 1 pkinase 3.91
424850 AA151057 Hs.153493 chromosome 18 open reading frame 1 ldl_recept_a 332
412350 AI659306 Hs.73826 protein tyrosine phosphatase, non-recept Y_phosphatase,Band_41,PDZ 3.70
447397 BE247676 Hs.18442 E-1 enzyme Hydrolase 3.68
452946 X95425 Hs.31092 EphA5 EPH_lbd,fn3,pkinase,SAM 3.66
427144 X95097 Hs.2126 vasoactive intestinal peptide receptor 2 7tm_2 3.65
443775 AF291664 Hs.204732 matrix metalloproteinase 26 Peptidase_M10 336
457940 AL360159 Hs.306517 Homo sapiens TRIpartite motif protein ps SPRY,7tmJ 332
418250 U29926 Hs.83918 adenosine monophosphate deaminase (isofo A_deaminase 331
413435 X51405 Hs.75360 carboxypeptidase E Zn_carbOpept 3.46
447210 AF035269 Hs.17752 phosphatidylserine-specific phospholipas lipase 3.46
TABLE 8: 136 GENES SIGNIFICANTLY DOWN-REGULATED IN PROSTATE
CANCER COMPARED TO NORMAL PROSTATE

Table 8 shows 136 genes significantly down-regulated in prostate cancer compared to normal
prostate. These were selected from 59680 probesets on the Affymetrix/Eos Hu03 GeneChip
array such that the ratio of "average" normal prostate to "average" prostate cancer tissues was
greater than or equal to 2. The "average" normal prostate level was set to the mean amongst 4
normal prostate tissues. The "average" prostate cancer level was set to the 85th percentile
amongst 73 tumor samples. In order to remove gene-specific background levels of non-specific
hybridization, the 10th percentile value amongst all the tissues was subtracted from
both the numerator and the denominator before the ratio was evaluated.
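
The ratio described above can be sketched in a few lines of Python. The sketch rests on assumptions that the text does not settle: the percentile is computed by linear interpolation between ranked values, a non-positive denominator after background subtraction yields no ratio, and the function names (`percentile`, `normal_to_tumor_ratio`, `is_down_regulated`) are illustrative only.

```python
from statistics import mean


def percentile(values, q):
    """Percentile q (0-100) of a non-empty sequence, by linear interpolation."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)


def normal_to_tumor_ratio(normal, tumor, all_tissues):
    """Background-subtracted ratio of "average" normal to "average" tumor level."""
    normal_level = mean(normal)                # mean amongst the normal samples
    tumor_level = percentile(tumor, 85)        # 85th percentile amongst tumors
    background = percentile(all_tissues, 10)   # 10th percentile amongst all tissues
    numerator = normal_level - background
    denominator = tumor_level - background
    if denominator <= 0:
        # Assumption: the text does not say how this case is handled.
        return None
    return numerator / denominator


def is_down_regulated(normal, tumor, all_tissues, threshold=2.0):
    """Apply the 'greater than or equal to 2' cut-off used for Table 8."""
    ratio = normal_to_tumor_ratio(normal, tumor, all_tissues)
    return ratio is not None and ratio >= threshold
```

With four normal values averaging 115, a tumor 85th percentile of 30 and a background of 10, the ratio is (115 - 10) / (30 - 10) = 5.25, which passes the threshold of 2.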

Pkey: Unique Eos probeset identifier number
ExAccn: Exemplar Accession number, Genbank accession number
UnigeneID: Unigene number
Unigene Title: Unigene gene title
R1: Ratio of normal prostate to prostate cancer

Pkey ExAccn UnigeneID Unigene Title R1

425932 M81650 Hs.1968 semenogeflnl 57.69 

425545 N98529 Hs.158295 Human mRNA for myosin light chain 3 (MIC 19.70 

426752 X69490 Hs.172004 tifin 1525 

442082 R41823 Hs.7413 ESTs; cafeyntenin-2 10.05 

25 407245 X90568 Hs.172004 tirjn 9.38 

422711 D60641 Hs21739 Homo sapiens mRNA; cDNA DKFZp586M51 8 (f 9.05 

420813 X51501 H&99949 praiactin-inducsd protein 8.18 

411987 AA375975 Hs.183380 'ESTs, Moderately similar to ALU7_HUMAN 7.45 

404567 5.62 

30 416030 H15261 Hs21948 ESTs 551 

444692 AI620617 Hs.148565 ESTs 527 

444573 AW043590 Hs225Q23 ESTs 520 

428068 AW016437 Hs233462 ESTs 5.08 

437440 AA846804 Hs. 123694 ESTs 435 

35 404113 4.75 

452279 AA286844 Hs.61260 hypothetical protein FU 131 64 4.75 

421058 AW297967 Hs.188181 ESTs 4.63 

445592 AV654382 Hs.17947 "ESTs, Weakly similar to K02F3.1 0 [Ceie 453 

405163 M9 

40 405227 4.45 

454059 NMJ»3154Hs37048 staiherin 4/5 

450152 AI138635 Hs22968 ESTs 4.40 

407013 U35637 "gb:Human nebuGn mRNA, partial cds" 4.03 

403612 452 

45 440089 AA864468 Hs.135646 ESTs 4X0 

408988 AL119844 Hs.49476 Homo sapiens clone TUA8 CrWiKhat regi 338 

436726 AA324975 Hs.128993 "ESTs, Weakly sanflar to KIAA0465 protel 335 

459367 BE148877 B gb:CmHru244-11119^0^12HTu244Hom 335 

427318 AF186081 Hs.175783 zinc transporter 332 

50 411762 AW860972 -gb:QV(H:Trj387-18ra0(>-167^u7CT0387Hom 335 

418668 AW407987 Hs37150 Human clone A9A2BR11 (CAC)n/(GTG)n repea 3.75 

458311 AF069478 *gb AF069478 Homo sapiens astrocytoma II 351 

403649 350 

419682 H13139 Hs.92282 pairecHike tomeodomain transcription fa 358 

55 412519 AA196241 Hs.73980 troponin T1, skeletal, slow - 351 

414206 AW276887 Hs.46609 ESTs 3.45 

427419 NM.000200HS.177888 histatin3 357 

420777 AA280223 Hs.130865 ESTs 355 

428134 AA421773 Hs.161008 ESTs 351 

60 450218 R02018 Hs.168640 'Ank, mouse, homolog of 350 

433474 AI192195 Hs.147174 "EST, Highly similar to ubiouinTvprotei 350 

41B833 AW974899 Hs292776 ESTs 326 

400440 XB3957 Hs53870 nebufin 3.16 
413778 AAD9Q235 Hs.75535 myosin, light polypeptide 2, regulatory 3.06

423151 AW838068 , gb.W3lT0048-0103lX>-109-f^ LT0048 Horn 3.05 

445060 AA830811 H&88808 ESTs 2.98 

457065 AI476318 Hs.192480 ESTs 235 

5 432456 H00093 'Qb?h8f12uJ9AW Outward Alu^ranadhn 232 

405678 2.85 

406707 S73840 H&931 "myosin, heavy polypeptide 2, skeletal m 231 

444105 AW189097 Hs.166597 ESTs 2.78 

433968 AL157518 Hs.90421 PR02463 protein 2.73 

10 438522 AAB09431 Hs358886 ESTs 273 

436562 H71937 Hs.169756 "complement component 1, s subcomponent" 2.68 

412417 AA102268 Hs.42175 ESTs 2.67 

455590 BE072259 , gb^V4-BT0536-271299^K9^|04 BT0536 Horn 2.65 

' 415380 F07853 Hs. 16085 putative G-protein coupled receptor 2.65 

IS 428729 AL162331 Hs.191436 hypothetical protein RJ 1061 9 2.64 

408537 AW207734 'gbAJI-H€l2-age-h-01-(HJU1 NCLCQAP.S 2.63 

424706 AA741336 Hs.152108 transcriptional unit N 143 2.63 

413212 BE072092 'gb:PM4-BT0532-16Q200-OQ3-b11 BTO532 Hom 2.63 

406704 M21665 H&929 "myosin, heavy polypeptide 7, cardiac mu 2.62 

20 437507 AA758538 Hs346882 ESTs 2.60 

410384 AI933794 Hs.42745 ESTs 2-58 

408074 R20723 Hs.1247B4 ESTs ZJ5B 

436653 AA829828 Hs392402 ESTs 232 

458090 AI282149 HS56213 "ESTs, Highly simitar to FXD3.HUMAN FORK 2.51 

25 432003 AI689154 Hs.122972 ESTs 230 

436915 AA737400 Hs.142230 ESTs 2.50 

410028 AW576454 H&258553 ESTs 2.46 

448920 AW408009 Hs32580 alkylgrycerone phosphate synthase 2.45 

422046 A1638562 "gb:ts5Da1Q.x1 NCLCGAPJffl Homo sapiens 2.44 

30 451122 AA015767 Hs.193587 ESTs ZM 

422646 H87863 Hs.151380 ESTs 236 

451237 AW600293 "gb:EST00049 pGEM-T {toy Homo sapiens 236 

400001 AFFX ccmUd: BioB-3 2.36 

415835 245365 , gb:HSC2NF061 normalized infant brain cO 236 

35 439706 AW872527 Hs39761 ESTs 2.36 

423341 AW242394 HS352495 ESTs 236 

436486 AA742221 Hs.120633 ESTs 235 

407449 AJ002784 gbiHomo sapiens mRNA; fetal brain cONA 5 233 

430573 AA744550 Hs.136345 ESTs 2.32 

40 401974 231 

443356 AL044498 Hs.133262 'ESTs, WeaWy simflar to PH0217 reverse 231 

430751 NNL012471H&247868 transient receptor potential channel 5 235 

439128 A1949371 Hs.153089 ESTs 235 

448765 R15337 Hs31958 "Homo sapiens cONAFU10532fis, clone N 235 

45 451130 AI762250 H&211347 ESTs 234 

405420 233 

455029 AWB51258 , gb:IL3CT0220-16020(M)664W6 CTQ220 Hom 233 

438224 AA933999 'gb:on91fD4.s1 SoaresJ4FL.T_GBC_S1 Homo 233 

407764 BE008347 "gb«M0-BN0154<)80400-325^T04BN0154Hom 233 

50 413549 BE252470 «gb:601108292F1 MHJ/1GCJ6 Homo sapiens 233 

437010 AA741368 HS391434 ESTs 233 

435111 AI914279 Hs313740 ESTs 232 

403375 231 

455060 AW853441 ty:RC1-CTO252-03010(W23lj09 CT0252 Horn 231 

55 409792 AW854153 , gb:RC3-CTQ254^)604OO-Q29-d03 CT0254 Hom 230 

421154 AA284333 Hs387631 "Homo sapiens cDNA FU14269 lis, clone P 2.19 

401963 2.18 

435034 AF168711 Hs.159397 x 010 protein 2.18 

448996 AW998989 Hs.105749 KWA0553 protein 2.18 

60 436816 AW297599 Hs355667 ESTs 2.17 

442252 AI733395 Hs.129124 ESTs 2.17 

419310 AA236233 Hs.188716 ESTs 2.16 

418579 H91800 Hs.124156 ESTs 2.16 

423315 R54109 H&36096 ESTs 2.16 

65 432744 AA988835 Hs.38664 ESTs 2.15 

424492 AI133482 H&165210 ESTs 2.15 

424770 AA425562 "Qb2w46e05Jl Scares JotaLfetusJ*b2HF8 2.15 

437101 AA744518 Hs.120610 ESTs 2.15 

428793 AC004957 Hs398975 "ESTs, Highly similar to collapsin-2-tik 2.15 
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415708 H56475 "gb.7l87d11.rl Soares_plneal_g!and^N3HPG 2.13 

459619 2.12 

427506 AK000134 Hs.179100 hypothetical protein RJ20127 2.12 

452508 AA804174 Hs.184354 ESTs 2.10 

5 410881 AWB09157 "gb:RC0-STO118-041099-Q31-o07J ST0118 Homo sapiens cONA, mRNA sequence" 2.10 

403087 2.10 

403869 2.10 

445028 D81194 Hs282499 ESTs 2.10 

447884 H295G5 "#):ym60d1O.r1 Soares infant brain 1MB Homo sapiens cDNA clone 5", mRNA sequence* 2.10 

10 414575 H11257 Hs£95233 ESTs 249 

420351 BE218221 Hs.190044 ESTs 248 

426998 BE274360 "gb:601 121068F1 NIH_MGC_20 Homo sapiens cDNA clone 5', mRNA sequence' 2X8 

405455 2X6 

423843 AA332652 "gb:EST36S27 Embryo, 8 week I Homo sapiens cONA 5 1 end similar to similar to 

1 5 monoamine oxidase B, mRNA sequence" 2X8 

406135 2X7 

427046 BE246180 Hs.121385 ESTs 2X7 

403493 2.05 

444514 AI6829Q5 Hs.270431 "ESTs, Weakly similar to ALU1_HUMAN ALU SUBFAMILY J SEQUENCE 

20 CONTAMINATION WARNING ENTRY |H.saptens]" 2X5 

435884 AA701443 Hs.192868 ESTs 2,05 

419629 AB020695 Hs.91662 KIAA0888 protein . 2X3 

405900 2X3 

457350 AW974438 Hs.194138 "ESTs, Moderately similar to AF091457 1 zinc finger protein R1N ZF [RirorvegicusJ" 2X2 

25 400007 AFFX control: BioDn-5 2X1 

406978 M6435B "gb:Human rhom-3 gene, exon.' 2X0 
TABLE 8A shows the accession numbers for those primekeys lacking a UnigeneID in Table
8. For each probeset we have listed the gene cluster number from which the oligonucleotides
were designed. Gene clusters were compiled using sequences derived from Genbank ESTs
and mRNAs. These sequences were clustered based on sequence similarity using Clustering
and Alignment Tools (DoubleTwist, Oakland California). The Genbank accession numbers
for sequences comprising each cluster are listed in the "Accession" column.
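
For readers who want to work with Table 8A programmatically, a row can be read as a probeset key, a cluster number, and a list of accession numbers. The following is a minimal sketch that assumes whitespace-separated fields and cleanly extracted rows; the function names `parse_cluster_row` and `index_by_accession` are hypothetical, and the cluster number is kept as text because its exact format is not specified here.

```python
def parse_cluster_row(line):
    """Split one 'Pkey CAT_number accession...' row into its three parts."""
    pkey, cat_number, *accessions = line.split()
    return {"pkey": int(pkey), "cat_number": cat_number, "accessions": accessions}


def index_by_accession(rows):
    """Map each Genbank accession to the probeset keys whose cluster contains it."""
    index = {}
    for row in rows:
        for accession in row["accessions"]:
            index.setdefault(accession, []).append(row["pkey"])
    return index


if __name__ == "__main__":
    rows = [
        parse_cluster_row("413549 1375933.2 BE252470 BE147573"),
        parse_cluster_row("458311 543550.1 AF069478 AF069479 AF069480"),
    ]
    print(index_by_accession(rows)["AF069479"])
```

The reverse index makes it straightforward to ask which probeset a given accession belongs to.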



Pkey: Unique Eos probeset identifier number
CAT number: Gene cluster number
Accession: Genbank accession numbers

Pkey CAT number Accessions

407764 1014849.1 BE008347 BE008320 BE083307 BE083311 AW075968 

408537 1064753J AW207734D60164D81150D81078D61356AW996804 

409792 1154877J AW8541 53 AW50Q210 BE1 45772 AW501 310 

410881 1225682J AWB09157AW812181 AWB12175 AW812172AW812161 AW812165 

411762 1256908J AW860972 AW862598 AW862599 AW860988 AW860983 AWB6089B AW860925 AWB60322 AW860986 AW860984 AW860989 

413212 1353792J BE072092 BE072106 BE072088 BE072098 BE072103 

413549 1375933.2 BE252470 BE147573 

415708 1548209J H56475 F29401 F34552 

415835 155851 1J 245365 R25905 H05203 T77496 

422046 210744.1 AI638562 T16929 H 13401 F07773 R55836 

423151 225415 J AW838068 AWB37986 AW838067 AA322487 AW837836 

423843 232510 1 AA332652 AA331633 AW999369 AW9G2993 BE170475 AA378845 AW964175 AI475221 

424770 243504.1 AA425562 AI880208 AA346846 N22655 AW81 1 775 AW81 1786 

426998 274259 -1 BE274360 

432456 347718*2 H00093 H00079 H00070 H00054 H00O49 H00063 AW905306 AW905241 AW905410 AW905307 AW90541 1 AW905240 
AW905210 

AW905352 AW905304 AW905239 AW905242 AW905243 H00087 

438224 452656.1 AA933999 AA781181 

447884 740749.1 H29505 R18575 Z43580 T48738 AW35454 BE004683 

451237 863269 J AW60Q293 AI767468 

455029 1249374.1 AW851258 AW851435 AW851106AW851421 

455060 1251259.1 AW853441 BE145228 BE145218 BE145162 BE145283 

455590 1335127.1 BE072259 BE072230 BE007911 

458311 543550.1 AF069478 AF069479 AF069480 
TABLE 8B shows the genomic positioning for those primekeys lacking Unigene IDs and
accession numbers in Table 8. For each predicted exon, we have listed the genomic sequence
source used for prediction. Nucleotide locations of each predicted exon are also listed.
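
The nucleotide locations in the genomic positioning tables are written as comma-separated "start-end" ranges, for example "90044-90184,91111-91345". A minimal parsing sketch follows. It assumes coordinates are inclusive (the text does not say), and the function names `parse_nt_positions` and `total_exon_length` are illustrative. Fields damaged during extraction raise a `ValueError` rather than being silently repaired.

```python
def parse_nt_positions(field):
    """Parse 'start-end,start-end,...' into a list of (start, end) integer pairs."""
    exons = []
    for part in field.split(","):
        start_text, end_text = part.strip().split("-")
        start, end = int(start_text), int(end_text)
        if start > end:
            raise ValueError(f"exon start exceeds end: {part!r}")
        exons.append((start, end))
    return exons


def total_exon_length(exons):
    """Total nucleotides covered, treating both coordinates as inclusive."""
    return sum(end - start + 1 for start, end in exons)


if __name__ == "__main__":
    exons = parse_nt_positions("90044-90184,91111-91345")
    print(exons, total_exon_length(exons))
```

Failing loudly on malformed ranges is deliberate, since a guessed coordinate would be harder to detect later than a rejected row.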



Pkey: Unique number corresponding to an Eos probeset

Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495.

Strand: Indicates DNA strand from which exons were predicted.

Nt_position: Indicates nucleotide positions of predicted exons.



Pkey Ref Strand Nt_position
401963 3126783 Plus 51382-51521
401974 3126777 Plus 85330-85683
403087 8954241 Plus 169511-169795
403375 9255944 Minus 92554-92795
403493 7341425 Plus 157568-159084
403612 8469060 Minus 94723-94859
403649 8705159 Minus 27141-27247
403869 7280046 Minus 34379-34583
404113 9588571 Minus 13446-13646
404567 7249169 Minus 101320-101501
405163 9966267 Minus 161171-161299
405227 6731245 Minus 22550-22802
405420 7211837 Minus 13428-13582
405455 7656675 Plus 134112-134671
405676 4079670 Plus 151821-152027
405900 6758705 Minus 71181-71535
406135 9164918 Minus 65489-85715
TABLE 9: 1001 GENES SIGNIFICANTLY UP-REGULATED IN NORMAL PROSTATE
COMPARED TO PROSTATE CANCER

Table 9 shows 1001 genes significantly up-regulated in prostate cancer compared to normal
prostate. These were selected from 59680 probesets on the Affymetrix/Eos Hu03 GeneChip
array such that the ratio of "average" normal prostate to "average" prostate cancer tissues was
greater than or equal to 8.14. The "average" normal prostate level was set to the mean
amongst 4 normal prostate tissues. The "average" prostate cancer level was set to the 85th
percentile amongst 73 tumor samples. In order to remove gene-specific background levels of
non-specific hybridization, the 10th percentile value amongst all the tissues was subtracted
from both the numerator and the denominator before the ratio was evaluated.



Pkey:
ExAccn:
UnigeneID:
Unigene Title:
R1:



Pkey ExAccn UnigeneID Unigene Title



451002 AA013299 

435596 AA689465 

443576 AI078027 

434247 AA928116 

400452 AK000165 



AA664330 




405172 
444897 
458019 
405275 
457815






Unique Eos probeset identifier number

Exemplar Accession number, Genbank accession number

Unigene number

Unigene gene title

Ratio of prostate cancer to normal prostate



427906 
443685 
451554 
418323 
429480 
426025 
418917 
404407 
442027 
433704 



AI474866 
NM.002118 



AW138330 
X02994 



415354 
424239 
444143 
401672 
430590 
411972 
448992 
408826 
409653 
402964 
422673 



AA608684 

U83527 

F06495 

M67439 

AW747996 

AW383947 

BE074959 

AI766053 

BE540279 

AW451693 

N59027 

AA372275 

R32704 

AW137088 
AW592931 



Hs5018 ESTs, WeaMy similar to ALU3JWMAN ALU S 
H&188999 ESTs 
Hs.169338 ESTs 
H&272065 ESTs 

gb;Homo sapiens cDMA FU20178 fls, clone 

Hs.166520 ESTs 
Hs.174481 ESTs 
Hs.193237 ESTs 

Hs.1 162 major hlstocompattolSty complex, class 
Hs.9295 etasiin (supravalvuiar aortic stenosis, 
Hs233778 ESTs 
Hs.1217 adenosine deaminase 

Hs.128395 ESTs 

Hs.121705 ESTs, Moderately similar to ALUC_HUMAN I 
gb:HSU83527 Human fetal brain (MLovett) 
gfcHSCI AB051 normalized infant brain cDN 

Hs.143526 dopamine receptor D5 

Hs.160999 ESTs 

Hs246381 CD58 antigen 

gbf M0-BTQ582-31 01 00-001 -K» BT0582 Homo 
Hs.188346 ESTs 

gb£01059857F1 NIH_MGC_10 Homo sapiens c 
H&220826 ESTs 

gb:yv59d1 1 .rl Soares fetal liver spleen 
Hs279800 Homo sapiens cDNARJtl 383 fis, done HE 
Hs.301298 ESTs 



407172 



435672 



417016 
438854 



AA703679 

AA339666 

T54095 

AA424163 

AI700148 

AA485224 

AA837098 

AF074994 



Hs.144857 
Hs.256298 
Hs-88500 
Hs.106999 



Hs.156895 

Hs283626 

Hs.57734 

H&269933 

HS24240 



ESTs 
ESTs 

mitogerhactivated protein kinase 8 inter 
ESTs, Weakly similar to SYT5_HUMAN SYNAP 
gb£ST44776 Fetal brain I Homo sapiens c 
gb:ya92c05.s1 Stratagene placenta (93722 
ESTs 
ESTs 

G protein-coupled receptor Wnase-intera 

ESTs 

ESTs 



R1 

1684.00 
738.00 



24520 

222.00 

221.33 

212.00 

16320 

149.45 

126.11 

12327 

120.00 

106.75 

10571 

100.53 

94.00 

89.18 

87.73 

86-82 

8643 

7726 

6847 

68.00 

6126 

57.71 

5640 

54.67 

54.00 

54.00 . 

5256 

5256 

52.32 

51.63 

50.98 

49.60 

4850 

4758 

4653 

4357 

4350 

42,70 

4257 
406134 4243

457319 AA480895 Hs^01552 ESTs, WeaWy sirrtlar to T17288 hypotheti 4251 

409314 AA070266 flb2m69d04j1 Stratagene nauroeplthellum 4225 

401124 41.61 

5 429316 AI371157 Hs.178538 ESTs 4050 

420317 AB00662B Hs.96485 KIAA0290 protein 39.64 

457586 AW062439 gbWR0OT0060-120899-001-f08 CT0060Homo 39.60 

417407 AAS23278 Hs290905 ESTs, Weakfy similar to protease [H^api 38.73 

430269 BE221682 Hs. 178364 ESTs 38D6 

10 439602 W79114 Hs58558 ESTs 36.69 

433686 AA604799 Hs.136528 ESTs, Moderately similar to AUU1_HUMAN A 3629 

417993 AW963705 Hs295806 ESTs, WeaWy Similar to ALU7_HUMAN ALU S 35.18 

428214 AA9362B2 Hs.120397 ESTs 35.10 

416908 AA333990 Hs50424 coagulation factor Xtll r A1 polypeptide 36.08 

IS 426264 BE314852 Hs. 168694 hypothetical protein FU 10257 36.00 

415911 H08796 Hs.124952 ESTs 36.00 

457502 AA076049 H&274415 Homo sapiens cONA FU10229 Us, done HE 3523 

421566 NM_000399 Hs.1395 earty growth response 2 (Krox-20 (Drosop 3520 

401468 3459 

20 458561 A1220150 Hs211195 ESTs 34.60 

433601 BE350738 his. 123993 ESTs, WeaJdy simSar to T00366 hypotheti 3324 

454977 AW848032 gb^L3-CT0214-231299<J534)11 CT0214Homo 32.96 

402828 3193 

414522 AW518944 Hs.76325 Homo sapiens eONA: FU23125 fis, done L 31.76 

25 402842 31.68 

421245 AA285383 gb:HTH280 HTCOLl Homo sapiens cONA 573 3159 

401631 F05183 Hs.1799 CD1D antigen, d polypeptide 3126 

408057 AW139565 gb:UI-H-BI1-aea«H)44-Ul.s1 NCLCGAP.Su 3124 

408069 H81795 gb:ys68a10j1 Soares retina N2b4HR Homo 3120 

30 438694 T87479 HS291797 ESTs 3159 

449156 AF103907 Hs.171353 prostate cancer antigen 3 29.78 

428796 AU076734 Hs.193665 solute carrier family 28 (sodiunvcoupled 29.76 

452549 AI907039 gbPM-BT1 34-020499-566 BT134 Homo sapien 2959 

410129 BE244074 Hs285531 regulator of Fas-induced apoptosis 2953 

35 414464 AI870175 Hs.13957 ESTs 2947 

412326 R07566 Hs.73817 Small Inducible cytokine A3 (homologous 2922 

459081 W07808 gbzb03a12.rt SoaresJetaLlung_NbHL19W 2920 

448702 AW102670 Hs.122464 ESTs 29.13 

451939 U80456 Hs27311 single-minded (Drosophlla) homolog 2 28.74 

40 443412 W84893 Hs.9305 angiotensin receptor-like 1 2851 

457324 AB028990 Hs243901 K1AA1067 protein 2824 

424247 X14008 H&234734 lysozyme (renal amyloidosis) 28.18 

457140 A1279960 Hs.178140 ESTs 28.12 

444151 AW972917 Hs.128749 aipha-iTOthylacyK^A racemase 2856 

45 457669 AW104257 Hs.123426 ESTs, WeaWy similar to putafive serine/ 2751 

412429 AV650262 Hs.75765 GR02 oncogene 2756 

405495 2753 

406516 2725 

407997 AW135429 HS243577 ESTs 2656 

50 442115 AW452332 Hs257554 ESTs 2656 

409038 T97490 Hs50002 smali inducible cytokine subfamily A (Cy 2654 

402838 2652 

449846 AI979284 Hs200552 ESTs " 2621 

417153 X57010 Hs51343 collagen, type II, alpha 1 (primary oste 2620 

55 439792 NM-014856 Hs5684 KIAA0476 gene product 2551 

450096 A1682088 H&223368 ESTs 25.60 

424196 AL133660 Hs.142926 Homo sapiens mRNA; cONA DKFZp434M0927 (f 2557 

414246 BE391090 Hs28Q278 EST 2557 

420848 NM_005188 Hs.99980 Cas-Br-W (murine) ecotropic retroviral t 25.48 

60 424778 AA251048 Hs.153042 lyrnphocyte antigen 9 25.42 

409126 AA063426 gbzf70c08.s1 SoaresjDineai_gIand_M3HPG 2525 

443936 AW083491 Hs.31196 ESTs 2522 

419392 W28573 gbSUlO Human retina cONA randomly prim 2551 

411201 T74588 Hs5509 ESTs, WeaWy similar to C03_HUMAN COMPLE 2455 

65 422940 BE077458 gb:RC1-BT0606-09050(H)15-b04 BT0606 Homo 24.76 

437571 AA760894 Hs. 153023 ESTs 24.74 

433973 AI014723 Hs.131770 ESTs 2457 

422416 BE019557 Hs.11900 Human DNA sequence from clone RP4-583P15 2453 

421552 AF026692 Hs. 105700 secreted frizzled-related protein 4 2449 
443668 U25758 Hs.134584 ESTs 2449

424800 A1035588 Hs.1 53203 My oD family Inhibitor 24.10 

453633 AA357001 H&34045 hypothetical protein FU20764 24.04 

430565 All 22081 H&244343 cadherin related 23 24D0 

5 433694 AI208611 Hs.12068 Homo sapiens cONA FU1 1720 fis, clone HE 2339 

451045 AA215672 gbzr86e09.s1 NCICGAPJSCB1 Homo sapiens 2333 

408583 AW449874 Hs.47359 ESTs 23.73 

444040 AF204231 Hs.1 82982 golglrv67 23.62 

414182 AA136301 gb:zk93g04.s1 Soares_pregnanLutenjs_MbH 23.39 

10 416678 NIUL001327 Hs.167379 cancei/testis antigen 2320 

408380 AF123050 Hs.44532 diublquitin 2238 

456076 BE243877 H&76941 ATPase,NaWK+ transporting, beta 3 poly 22.65 

418299 AA279530 Hs.83368 integrin, beta 2 (antigen CD1 8 (p95), ly 2238 

444917 R68651 Hs.144997 ESTs 2226 

15 444381 BE387335 HS2B3713 ESTs 2238 

415788 AW628686 Hs.78851 KIAA0217 protein 22.04 

410896 AW809637 gbMR4^T0124-261099-015407ST0124Homo 22.00 

412978 A1431708 H&820 homeoboxC6 2135 

458418 AV653846 his. 126261 Homo sapiens Chromosome 16 BAG done CiT 2154 

20 454791 BE071874 gb:RC2*BT0522-12G200-014-a06 BT0522 Homo 21.84 

408748 J05500 Hs.47431 spectrin, beta, erythrocytic (includes s 2126 

416011 H14487 gbryml 8c1 0/1 Scares infant brain 1NIB H 2124 

440474 AJ207938 Hs.7195 gamma-aminobutyric add (GABA) A recepto 21.14 

447047 AI623698 Hs246306 Homo sapiens cDNA: FU23529 Ms, done L 21.11 

25 426793 X89887 Hs.172350 HIR(histone cell cycle regulation defec 21.10 

409841 AW502139 gbAJI-HFBROp-ajr-e-OSO-UlJl NIH_MGC_5 21XT7 

405685 2050 

457359 AI983207 Hs.192481 ESTs, Weakly similar to SYPH_HUMAN SYNAP 2034 

423067 AA321355 Hs285401 ESTs 20,74 

30 422355 AW403724 Hs.1 40 tmmunogtobuSn heavy constant gamma 3 (G 20.73 

401201 20.73 

458278 W28912 Hs.129019 ESTs 20.68 

439097 H66948 gbryr$6d10j1 Scares tetal Ever spleen 20.67 

414875 H42679 Hs.77522 major histocompatibility oorrptex, class 20.66 

35 400928 2036 

451355 NM 004197 Hs.444 serine/threonine kinase 19 2034 

446982 AW500221 Hs.43616 Homo sapiens mRNA for FUQ0029 protein, 2031 

417105 X60992 Hs31226 C06 antigen 2031 

405777 2031 

40 424123 AW966158 Hs38582 Homo sapiens cDNA FU12702 fis, done HI 2020 

425009 X58288 Hs.154151 protein tyrosine phosphatase, receptor t 20.10 

443271 BK68568 Hs.195704 ESTs 1938 

421064 AJ245432 Hs.101382 tumor necrosis factor, aJpha-induced pro 1938 

418819 AA228776 Hs.191721 ESTs 1934 

45 457595 AA584854 gb»o09h11 Jl NCLCGAP^hel Homo sapiens 1930 

404426 1934 

412571 U43143 Hs.74049 fms-relatfid tyrosine kinase 4 19.79 

431457 NM.012211 Hs256297 integrin, alpha 11 19.62 

414002 NM.006732 Hs.75678 FBJniurine osteosarcoma viral oncogene h 1937 

50 418994 AA296520 Hs39546 Selectin E {endothelial adhesion motecui 1936 

437158 AW090188 Hs.4779 K1AA1 150 protein 1932 

4378S6 AA156781 Hs33992 ESTs 19.44 

417421 AL138201 Hs32120 nuclear receptor subfamily 4, group A, m - 1934 

433057 X15675 Hs296832 Human pTR7 mRNA lor repetitive sequence 1922 

55 421730 AW449808 Hs.1 64036 glucosamine (N-acetyi)^-sulfatase (Sanf 1921 

456557 AA284477 Hs.96618 ESTs 18.77 

440806 A1247422 Hs.129966 ESTs 18.76 

439845 AL355743 Hs36663 Homo sapiens EST from done 41214, full 1835 

416155 AI807264 Hs205442 ESTs, Weakly similar to AF11 761 0 1 inner 1834 

60 437820 AA769062 Hs.16029 ESTs, Weakly similar to alternatively sp 18.62 

450923 AW043951 Hs.38449 ESTs 1839 

418329 AW247430 Hs.84152 cystathlonine-beta-synmase 1838 

424537 A1673027 Hs.143271 ESTs 1835 

447742 AF113925 Hs.19405 caspase recruitment domain 4 1832 

65 415251 R42863 Hs.7124 ESTs 1847 

440770 AA912815 H$222076 ESTs 1840 

407711 AI085846 Hs25522 ESTs 1832 

427157 U51166 Hs.1 73824 thymine-DNA glycosylase 1828 

409847 AW501751 Hs279733 ESTs 18.15 
417240 N5756B
435732 AF229178 
436896 AW977385 
429490
429984 
449214 
433867 
431735 
401515 
444045 
442754 



AI971131 
AL050102 
AI689114 
AKD00596 
AW977724 



432415 
427829 
432516 



414989 
444880 
417651 
453457 
424246 
419078 
417696 
431117 
455254 
425782 
426678 
426403 
425905 
438867 
420940 
459234 
404756 
422247 



AL045825 

AB001914 

T16971 

A118B225 

R08003 

AA152106 

T81668 

AW118683 

R06874 

AL037103 

AW452533 

M93119 



438703 
411424 
402695 
422538 
447108 
448520 



407811 
410721 
437133 
408182 
417315 
431840 
439882 
418277 
410688 
420120 



AW877015 

U66468 

H08170 

NM_000361 

AB032959 

AW451157 

AA830664 

AI940425 

U18244 

F09247 

AI076765 

AI803373 

AW845985 

NM_006441 

AW449602 

AB002367 

AW451955 

AW190902 



447033 
421684 
408599 
446012 
409671 
405934 
426108 
416208 
410708 
447342 
454563 
411507 
438170 
416292 



AB018319 

AA047854 

AI080042 

AA534908 

AA847856 

AW135221 

AW796342 

AL04S610 

NM_003816 

AI357412 

BE281591 

AA0558O0 

AV656098 

AA076769 

AA622037 

AW291166 

AA534370 

AI199268 

AW807530 

AW850140 

A016685 

AA179233 



Hs.176028 EST 

Hs.123136 leucine rich repeat and death domain con 
Hs.278615 ESTs 

Hs.276770 CDW52 antigen (CAMPATH-1 antigen) 

Hs.293684 ESTs, Weakly similar to alternatively sp 

Hs.227209 DKFZP586F1019 protein 

Hs.195663 ESTs 

Hs.3618 hippocalcin-like 1 

Hs.75968 thymosin, beta 4, X chromosome 

Hs.135548 ESTs 
Hs.210197 ESTs 

Hs.170414 paired basic amino acid cleaving system 
Hs.289014 ESTs 
Hs.127462 ESTs 
Hs.188013 ESTs 
Hs.4859 cyclin L ania-6a 

gb:yd29c04.r1 Soares fetal liver spleen 
Hs.154150 ESTs 
Hs.268628 ESTs 

Hs.270599 ESTs, Weakly similar to unnamed protein 

Hs.143604 Kaiso 

Hs.89584 insulinoma-associated 1 

Hs.82401 CD69 antigen (p60, early T-cell activati 

Hs.250500 delta (Drosophila)-like 1 

gb:QV2^0010-2503OW9M12 PT0010 Homo 
Hs.159525 cell growth regulatory with EF-hand doma 
Hs.113755 ESTs 
Hs.2030 thrombomodulin 
Hs.161700 KIAA1133 protein 
Hs.181157 ESTs 
Hs.143974 ESTs 

gb:CMO^CT0052-15079W)24-c04 CT0052 Homo 

Hs.113602 solute carrier family 1 (high affinity a 
Hs.167399 protocadherin alpha 5 
Hs.269899 ESTs 
Hs.51599 ESTs 

gbflC2-CTO163-2OO999^H08 CT0163 Homo 



Hs.118131 5, 
Hs.217953 ESTs, Moderately similar to NK-TUMORREC 
Hs.21355 doublecortin and CaM kinase-like 1 
Hs.153065 ESTs 

Hs.40098 cysteine knot superfamily 1, BMP antagon 
Hs.2730 heterogeneous nuclear ribonucleoprotein 
Hs.5460 KIAA0776 protein 

gbztt9gQ4.r1 Soares retina N2b4HR Homo 
Hs.180450 ribosomal protein S24 
Hs.2860 POU domain, class 5, transcription facto 
Hs.124565 ESTs 
Hs.130812 ESTs 

pj)PM2-UM0027-23020(H)02-h02 UM0027 Homo 
Hs.55243 transcription elongation factor A (SII)- 
Hs.2442 a disintegrin and metalloproteinase doma 
Hs.157601 EST - not in UniGene 
Hs.106768 hypothetical protein FLJ10511 
Hs.222933 ESTs 

Hs.172382 hypothetical protein FLJ20001 

gb:7B02B10 Chromosome 7 Fetal Brain cDNA 

Hs.166468 programmed cell death 5 
Hs.41295 ESTs 

Hs.154088 Homo sapiens cDNA: FLJ22756 fis, clone K 
Hs.19322 ESTs; Weakly similar to !!!! ALU SUBFAMI 

p^^0-ST(X)81-130999^»W02ST0081 Homo 
gb:IL3-CT0219-261099-023-D11 CT0219 Homo 
Hs.194601 ESTs 

Hs.42390 nasopharyngeal carcinoma susceptibility 
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18.13 
18.12 
18.12 
1750 
17.82 
17.82 
17.75 
17.72 
17.71 
17.67 
1758 
1755 
1754 
17J50 
1750 
1744 
17-36 
1751 
1750 
1757 
1722 
1722 
17.18 
17.14 
17.14 
17.14 
17.12 
17.12 
17J01 
1750 
1658 
1654 
1652 
1651 
1650 
1658 
1650 
16.78 
16.70 
1659 
1658 
1655 
1654 
1652 
1650 
1650 
1640 
1652 
1650 
1628 
1620 
1659 

1654 
1654 
1652 
1652 
1554 
1553 
1556 
1555 

1554 
1554 
15.48 
1542 
1558 
1557 
1556 
1529 
1526 
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406638 M13861 
446686 AW138043 
434485 AJ 623511 
441188 AW292830 
444172 BE147740 
409521 BE244654 
420748 AA279956 
422583 AA410506 
424240 AB023185 
451118 AI862096 
437495 BE177778 
445467 AI239832 
418305 AW006783 
402812 

436851 AA732480 
400991 

415752 BE314524 
429900 AA460421 
403883 

430315 NM_004293 
451952 AL120173 
424687 J05070 
447229 BE617135 
425818 AB021225 
448553 AI638449 
431089 BB041395 
459145 AI903354 
448650 AF055575 
400952 

445685 AI734009 
407938 AA905097 
431676 AI685464 
437210 AA311443 
451900 AB023199 
445800 AA126419 
412368 AW945992 
409055 AW304028 
408763 W57550 
446734 AL049278 
413551 BE242639 
421913 AI934365 
452712 AW838616 
451468 AW503398 
406038 Y14443 
424909 S78187 
434078 AW880709 
415254 AI815831 
418196 AI745649 
410020 T86315 
411352 NM_002890 
429848 AF145439 
413729 BE159999 
400125 

420319 AW406289 
448272 AI479094 
422695 AA315158 
424565 AW102723 
458048 H30340 
408694 AI935400 
454093 AW860156 
410889 X91662 
457751 AI908236 
455131 AW857913 
408364 AW015238 
425907 AA365752 
402359 
401044 

409877 AW502498 
423690 AA329648 



Hs.156307 

Hs.118567 

Hs255609 

Hs.104558 

Hs.159578 

Hs38672 

Hs.118578 

Hs.143535 

Hs.60540 

Hs.15617 



gb:Human T-cell receptor active beta-cha 

ESTs 

ESTs 

ESTs 

ESTs 

Homo sapiens mRNA for FLJ00020 protein, 
ESTs 

H.sapiens mRNA for ribosomal protein L18 
calcium/calmodulin-dependent protein kin 
ESTs 

gb:RC1-HT0598-31030(H)12-f07 HT0598 Homo 
ESTs, Weakly similar to ALU4_HUMAN ALU S 
ESTs 



Hs.293581 ESTs 

Hs.78776 Human putative transmembrane protein (nm 
Hs.30875 ESTs 

Hs.239147 guanine deaminase 
Hs.301663 ESTs 

Hs.151738 matrix metalloproteinase 9 (gelatinase B 

gb:601441677F1 NIH_MGC_65 Homo sapiens c 
Hs.159581 matrix metalloproteinase 17 (membrane-in 
Hs.173031 ESTs 

Hs.283676 ESTs, Weakly similar to unknown protein 

gb:ROBT029-100199-1 17 BT029 Homo sapien 
Hs.297647 ESTs, Moderately similar to calcium chan 



Hs.127699 

Hs.85050 

Hs292638 

H&293563 

Hs272U7 

Hs301632 

Hs.181125 

Hs300578 

H&301526 

Hs.16074 

Hs.75425 

Hs.109439 

H&210047 

H&88219 

Hs.153752 

HS283683 

Hs.184378 

H&26549 

Hs.728 

Hs.758 

HS225946 



EST cluster (not in UniGene) 



Hs.170786 

Hs.75295 
Hs.173705 
H&217286 

Hs36744 



Hs.128453 
Hs.155965 



ESTs 

Homo sapiens mRNA; cDNA DKFZp586E2317 (f 

KIAA0982 protein 

ESTs 

immunoglobulin lambda locus 
ESTs 

Homo sapiens cDNA FLJ13181 fis, clone NT 
Homo sapiens mRNA; cDNA DKFZp564I153 (fr 
ubiquitin associated protein 
osteoglycin (osteoinductive factor, mime 
p>:RC5-LT0054-1402Q0413-D01 LT0054Homo 
ESTs 

zinc finger protein 200 
cell division cycle 25B 
EST 
ESTs 

ESTs, Weakly similar to T00066 hypotheti 
ribonuclease, RNase A family, 2 (liver, 
RAS p21 protein activator (GTPase activa 
chemokine (C-C motif) receptor 9 
gb:QV1-HT0412-270300-123Kt10 HT0412 Homo 

hypothetical protein 
ESTs 

gb:EST186956 HCC cell line (metastasis t 
guanylate cyclase 1, soluble, alpha 3 
Homo sapiens cDNA: FLJ22050 fis, clone H 
ESTs 

gbflaKTT0379-29010W)32-b04 CT0379 Homo 
twist (Drosophila) homolog (acrocephalos 
gb:IL-BT166-180399-010 BT166 Homo sapien 
gb:RC0-CT0323-231199-031-b05 CT0323 Homo 
ESTs 
ESTs 



1526 
1525 
1524 
1522 
1522 
15.16 
15.14 
15.14 
16.12 
15.12 
15.12 
15.06 
15.03 
15.02 
15.00 
15.00 
14.96 
1450 
1434 
1430 
14.72 
14.69 
1437 
14.65 
14.63 
1430 
1435 
1434 
1446 
1444 
1442 
1440 
1436 
1436 
1432 
1431 
1423 
1422 
1422 
1422 
1422 
1422 
14.16 
14.14 
1437 
1437 
1435 
1432 
1338 
1338 
1335 
1330 
1338 
1335 
1330 
1330 
13.78 
13.78 
13.76 
13.75 
13.74 
13.72 



Hs.157150 ESTs, Weakly similar to zinc finger prot 
Hs.23804 ESTs 



1337 
1332 
1330 
1333 
1333 
1349 
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30 






427337 
434318 
435193 
414756 
420626 
420052 
414020 
403851 
422647 



430685 A169Q234 
414052 AW578B49 
447858 AW080339 
435716 A1573283 
439120 H56389 
402788 

451591 AA886446 
405411 

426558 AW188574 
453506 AA132818 
416445 ALO43004 
457084 AI074149 



Z46223 

AW207552 

N41359 

AW451101 

AF043722 

AA418850 

NM_002984 

W07492 

AI762836 

AB033113 

R2196B 

BE386844 

AI796320 

AA278362 

BE262802 

NM_001621 

AA155859 

BE387790 

T99719 

AW964806 

AI660552 

H20276 

AL137466 

N75276 

AA032197 

BE267154 

NM_004354 

AA015879 

AW903830 

AW161319 

D63480 

NM_001259 

AA534163 

H41324 

D63216 

AU076649 

AA587775 

BE077084 

NM_000878 

BE167229 

BE265839 

U97018 

W26786 

AU076643 

AW873704 

AI306389 

D83407 

H85157 






435063 
439367 
451957 
420569 
447883 
426490 
414789 
451416 
443494 
425878 
431912 
407122 
456491 
448172 
452144 
419953 
416162 
451154 
412257 
449784 
432695 
454105 
439093 
416098 



Hs.191666 ESTs, Weakly similar to reverse transcri 
Hs.283552 ESTs, Weakly similar to unnamed protein 
Hs5119H ESTs 
Hs.58458 ESTs 

gb:yt87c03.r1 Soares_pineal_gland_N3HPG 

Hs.146278 ESTs 

Hs.24218 ESTs 

Hs.110407 ESTs, Weakly similar to coded for by C. 
Hs.300678 Human serine/threonine kinase mRNA, part 
Hs.150905 ESTs, Weakly similar to chondroitin 4-su 

Hs.176663 Fc fragment of IgG, low affinity IIIb, r 

Hs.116328 ESTs, Weakly similar to dJ134E15.1 (H.sa 

Hs.218107 ESTs 

Hs.159489 ESTs, Moderately similar to hexokinase I 

Hs.99491 RAS guanyl releasing protein 2 (calcium 

Hs.44410 ESTs 

Hs.75703 small inducible cytokine A4 (homologous 






450704 
405856 
412935 
402802 



414604 
414664 
452560 
413869 
452359 
435886 
445230 
412226 
446619 
447769 
414478 



Hs.157101 

H&271433 

Hs50187 

Hs£7734 

H&248746 

Hs.10299 

HS589062 

Hs.4909 

Hs.170087 

Hs.79708 

Hs56369 

Hs570404 

Hs.38085 

Hs.154903 

HS51742 

HSJ97277 

Hs.135904 

Hs, 102553 

Hs.125752 

Hs.79069 

H&33536 

Hs.12915 

Hs578634 

Hs58481 

Hs5476 

Hs51581 

Hs.153684 

Hs.76558 

Hs56295 

Hs.75596 
Hs.29206 
Hs.12126 
Hs.12451 

H&313 

Hs.48764 

Hs.76240 

Hs.156007 

Hs.40696 



ESTs 

ESTs, Moderately similar to ALU2_HUMAN A 
KIAA1287 protein 

G protein-coupled receptor kinase-intera 
ESTs 

Homo sapiens cDNA FLJ13545 fis, clone PL 

Homo sapiens cDNA FLJ12334 fis, clone MA 

dickkopf (Xenopus laevis) homolog 3 

aryl hydrocarbon receptor 

ESTs 

ESTs 

Homo sapiens cDNA: FLJ22389 fis, clone H 
ESTs, Weakly similar to putative glycine 
ESTs, Weakly similar to A56154 Abl subst 
ESTs 

Homo sapiens mRNA; cDNA DKFZp434H1322 (f 

ESTs 

ESTs 

ESTs 

cyclin G2 

ESTs 

gb^M4^N1037.25040O1554i04 NN1037 Homo 
ESTs 

KIAA0146 protein 

cyclin-dependent kinase 6 

serine protease inhibitor, Kazal type, 5 

ESTs, Moderately similar to ST1B_HUMAN S 

frizzled-related protein 

growth arrest and DNA-damage-inducible 3 

Homo sapiens HSPC311 mRNA, partial cds 

gb:RC54T06a3-22u20<M)13-C07 BT0603 Homo 

interleukin 2 receptor, beta 

Homo sapiens clone 24659 mRNA sequence 



echinoderm microtubule-associated protei 
gb:15d7 Human retina cDNA randomly prime 
secreted phosphoprotein 1 (osteopontin, 
ESTs 

adenylate kinase 1 

Down syndrome critical region gene 1-lik 
ESTs 



BE267045 Hs.75064 tubulin-specific chaperone c 



419978 
403137 



AA889120 
NMJXM454 



Hs.110637 homeo box A10 
Hs.93974 forkhead box J1 



430226 BE245562 Hs5551 adrenergic, beta-2-, receptor, surface 



13.47 
1346 
13.44 
1344 
1343 
1340 
1340 
1358 
1354 
1353 
1352 
1352 
1352 
1350 
1358 
1358 
1357 
1356 
1355 
1355 
1354 
1351 
1351 
1350 
13.19 
13.17 
13.16 
13.14 
13.07 
1356 
13.05 
13.04 
13.03 
13.02 
1100 
13.00 
1259 
12.98 
1256 
1256 
1254 
1253 
1253 
1252 
1252 
1252 
1250 
1258 
1258 
1258 
1254 
1254 
1250 
1250 
12.78 
12.78 
12.77 
12.76 
12.76 
12.76 
12.68 
1256 
12.66 
1255 
12.62 
1252 
1252 
1250 
1257 
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448076 AJ133123 Hs20196 adenylate cyclase 9 1236 

450462 F07097 Hs30082B Homo sapiens mRNA full length insert cDN 1254 

405236 1252 

409292 AA071051 gb:zm58e05.s1 Stratagene fibroblast (937 1247 

421540 AA767669 Hs.10242 ESTs 1247 

425640 AW978731 Hs.301824 ESTs 1244 

443181 AI039201 Hs.34548 ESTs 1242 

452436 BE077546 Hs.31447 ESTs 1242 

_ 455i§3 AW984111 gb:RCO.HN0OO7-16030(MJ11-K)9 HN0007 Homo 1240 

432687 AI926047 Hs.162859 ESTs 1237 

410494 M36564 Hs.64016 protein S (alpha) 1236 

439024 R96696 Hs.35598 ESTs 1236 

451246 AW189232 Hs.39140 cutaneous T-cell lymphoma tumor antigen 1236 

432892 AL042815 Hs.15995 ESTs 12.35 

418982 AI348838 Hs.13073 ESTs 1235 

414516 AI307802 Hs.279551 ESTs 1234 

440134 BE410734 gb:601301619F1 NIH_MGC_21 Homo sapiens c 1229 

443873 AL048542 Hs.16291 ESTs 1228 

401286 1236 

454020 AW962845 Hs.256527 ESTs 1224 

420077 AW512260 Hs.87767 ESTs 1224 

443837 AJ984625 Hs.9884 spindle pole body protein 1224 

407519 X64979 gb:H.sapiens mRNA HTPCRX01 for olfactory 1223 

435839 AF249744 Hs.25951 Rho guanine nucleotide exchange factor ( 1222 

448552 AW973653 Hs.20104 hypothetical protein FUQ0G52 1220 

405325 1220 

451009 AA013140 Hs.115707 ESTs 12.18 

423066 Y18264 Hs.120171 ESTs 12.17 

439556 AI623752 Hs.163603 ESTs 12.16 

443062 N77999 Hs.3963 Homo sapiens mRNA full length insert cDN 12.15 

445873 AA250970 Hs.251946 Homo sapiens cDNA: FLJ23107 fis, clone L 12.14 

453542 AW836724 Hs.33190 Homo sapiens mRNA expressed only in plac 12.11 

440106 AA864968 Hs.127699 ESTs 12.10 

417605 AF006609 Hs.32294 regulator of G-protein signalling 3 12.10 

440266 U29589 Hs.7138 cholinergic receptor, muscarinic 3 12.04 

420061 AW024937 Hs.29410 ESTs 12.02 

458727 AI022813 Hs.32679 Homo sapiens clone COABP0014 mRNA sequen 11.96 

445407 AI222658 Hs.221889 ESTs, Weakly similar to la costa [D.mela 1135 

418250 U29926 Hs.33918 adenosine monophosphate deaminase (isofo 1134 

414129 AI990287 Hs27079B ESTs 1133 

409799 D11926 Hs.76845 phosphoserine phosphatase-like 1132 

438461 AW075485 Hs.286049 phosphoserine aminotransferase 1132 

443912 R37257 Hs.184780 ESTs 1132 

424606 AA343936 gb:EST49786 Gall bladder I Homo sapiens 1130 

434217 AW014795 Hs.23349 ESTs 1130 

451533 NM_004657 Hs.26530 serum deprivation response (phosphatidyl 1130 

422423 AF283777 Hs.116481 CD72 antigen 1139 

409398 AW386461 gb:PM4-PT0019-12129^004-F02 PT0019 Homo 1139 

423853 AB011537 Hs.133466 slit (Drosophila) homolog 1 1132 

446180 AI074413 Hs.14220 hypothetical protein FLJ20450 1130 

414341 D80004 Hs.75909 KIAA0182 protein 1130 

406538 11.79 

433253 AW450502 Hs.24218 ESTs 11.79 

447397 BE247676 Hs.18442 E-1 enzyme 11.78 

451684 AF216751 Hs.26813 CDA14 11.76 

416862 R23765 Hs.23575 ESTs 1174 

425770 NM_014363 Hs.159492 spastic ataxia of Charlevoix-Saguenay (s 11.72 

428826 AL048842 Hs.194019 attractin 11.72 

433037 NM_014158 Hs.279938 HSPC067 protein 11.72 

447476 BE293466 Hs.20880 ESTs 11.72 

452092 BE245374 Hs.27842 hypothetical protein FLJ11210 11.72 

412922 M60721 Hs.74870 H2.0 (Drosophila)-like homeo box 1 11.72 

401680 NM_005578 Hs.180396 LIM domain-containing preferred transloc 1139 

422576 BE548555 Hs.118554 CGW3protein 11.68 

450203 AF097994 Hs.301528 L-kynurenine/alpha-aminoadipate aminotra 1138 

410531 AW752953 gb«Vm0224-261039^g02CT0224Homo 1137 

425917 W28517 Hs.117167 Homo sapiens cDNA: FLJ23067 fis, clone L 11.66 

418693 AI750878 Hs.37409 thrombospondin 1 11.64 

400557 1132 
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416188 BE167260 Hs.79070 - v-myc avian myelocytornatosls viral oncog 11-60 

419047 AW952771 Hs.90043 ESTs 11.59 

420441 AI986160 Hs.58446 ESTs 1159 

400885 11.57 

409853 AW502327 gb.-UI-HF^R0p^ca-a-07KRII.r1 NIH_MGC_5 1156 

400802 1156 

434540 NM_016045 Hs.5184 TH1 drosophila homolog 1155 

431449 M55994 Hs.256278 tumor necrosis factor receptor superfami 1155 

425928 S55736 Hs.238852 ESTs, Weakly similar to hypothetical pro 1154 

434701 AA460479 Hs.4096 KIAA0742 protein 1153 

434228 242047 Hs.283978 ESTs; KIAA0738 gene product 1152 

420729 AW964897 Hs.290825 ESTs 1152 

428328 AA426080 Hs.58489 ESTs 1150 

433887 AW204232 Hs.279522 ESTs 1150 

414812 X72755 Hs.77367 monokine induced by gamma interferon 11.46 

457718 F18572 Hs.22978 ESTs 11.44 

452260 AA453208 Hs.28726 RAB9, member RAS oncogene family 11.42 

459029 AA131376 Hs.285203 fibroblast growth factor 12 11.42 

456267 AI127958 Hs.53393 cystatin E/M 1159 

433285 AW975944 Hs.237396 ESTs 1158 

449186 AW291876 Hs.196986 ESTs 1157 

447861 AI434593 Hs.164294 ESTs 1157 

456023 R00028 gb:ye70a06.s1 Soares fetal liver spleen 1156 

439444 AI277652 Hs.54578 ESTs 1151 

401163 1151 

430886 L36149 Hs.248116 chemokine (C motif) XC receptor 1 11.28 

450784 AW246803 Hs.47289 ESTs 11.28 

452391 AL044829 Hs.29331 carnitine palmitoyltransferase I, muscle 1127 

449625 NM_014253 Hs.23796 odz (odd Oz/ten-m, Drosophila) homolog 1 1126 

456827 AA075687 Hs.147176 epidermal growth factor receptor substra 1124 

439328 W07411 Hs.118212 ESTs, Moderately similar to ALU3_HUMAN A 1124 

432093 K28383 gb:yt52c03.r1 Soares breast 3NbHBst Homo 1124 

407335 AA631047 Hs.158761 Homo sapiens cDNA FLJ13054 fis, clone NT 1123 

442501 AA315267 Hs.23128 ESTs 1122 

429746 AJ237672 Hs.214142 5,10-methylenetetrahydrofolate reductase 1121 

422858 R35398 gb:yg64g10j1 Soares infant brain 1NIB H 1120 

415156 X84908 Hs.78060 phosphorylase kinase, beta 1120 

446713 AV660122 Hs.282675 ESTs 1120 

452221 C21322 Hs.11577 ESTs 1120 

418261 W7B902 Hs.293297 ESTs 11.17 

433332 AI367347 Hs.127809 ESTs 11.16 

434539 AW748078 Hs.214410 ESTs 11.16 

413471 BE142098 gb:CA/U-HT0137-220999417-dt1 KT0137Homo 11.14 

410037 AB020725 Hs.58009 KIAA0918 protein 11.14 

405601 11.13 

458332 AI000341 Hs.220491 ESTs 11.12 

427654 AA410183 Hs.137475 ESTs 11.12 

427138 N77624 Hs.173717 phosphatidic acid phosphatase type 2B 11.10 

431475 AI567669 Hs.287316 ESTs 11.10 

425710 AF030880 Hs.159275 solute carrier family, member 4 11.08 

413748 AW104057 Hs.19193 ESTs 11.07 

409208 Y00093 Hs.51077 integrin, alpha X (antigen CD11C (p150), 11.07 

457278 W92745 Hs.193324 ESTs 11X0 

407021 U52077 gb:Human mariner1 transposase gene, comp 1102 

445701 AF055581 Hs.13131 lymphocyte adaptor protein 1102 

408338 AW867079 gbMRI ^0033-120400002*10 SN0033 Homo 1055 

401030 BE382701 Hs.25960 v-myc avian myelocytomatosis viral relat 1055 

437891 AW006969 Hs.6311 hypothetical protein FLJ20859 1054 

453874 AW591783 Hs.56131 collagen, type XIV, alpha 1 (undulin) 1054 

421562 AA530994 Hs.105803 ghrelin precursor 1052 

413431 AW246428 Hs.75355 ubiquitin-conjugating enzyme E2N (homolo 1052 

400132 1052 

436420 AA443966 Hs.51595 ESTs 1050 

424880 NM_000328 Hs.153614 retinitis pigmentosa GTPase regulator 1058 

433264 D85782 Hs.5229 cysteine dioxygenase, type I 1058 

429842 AI366213 Hs.173422 KIAA1605 protein 1057 

412405 AW948126 gb:RC0-MT0013-280300^31-a12 MT0013 Homo 1055 

400615 1050 

425018 BE245277 Hs.154196 E4F transcription factor 1 1050 
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65 



456011 


DcZ4352fl 


gb:TCBAP1D1053 Pediatric pre-B cell acut 


10.79 




Bel 76862 


90KW-HT0557-170300K)12-a04 HT0587 Homo 


10.74 


450410 


DCOIOJMO 

dcZ1o4io 


u. nr\i or\n cot* 
HS201802 CdlS 


10.73 


412490 


AWoUo004 


U* ooaocn CCT« 
HS^ooooO to IS 


10.72 


436952 


AW377314 


H&53o4 DKFZP564I052 protem 


10.70 


437743 


Al 383497 


HS.131 81 1 ESTs, Weakly stmBar to ALU 1_HUMAN ALU S 


10.70 


449367 


a Anon 
H40978 


Un 174 itnO PCT« tlnJ*Hit/ilii A | M t| nr A*. Al 1 U III II J A LI A 

rls .271498 cSTs, Moderately similar to alu 1_H uman a 


10.70 


449590 


AA694070 


II. AMBfl? POT« 

HS268835 ESTS 


10.68 


446035 


Mil mccco 

NMJJ06558 


Hs. 13565 Samoo-iJKe pnospnotyrosne prate in, T-ST 


10.68 


426530 


U24578 


Hs.1 70250 complement component 4A 


in ee 
10 AO 


428600 


AW863261 


Hs.15036 ESTs, Hignfy similar to AF161358 1 HSPC0 


10^4 


420090 


A AOOAOOO 

AA220238 


HSJ94S8D nDonuctease P (3oku) 


10.64 


451593 


AF151879 


H&26706 CG 1-121 protein 


10.62 


438693 


A r~n~T rno< 

AF075031 


tin Of¥X>7 CCTm 

HS29327 eSTS 


in fio 
10^2 


459324 


AW080953 


gb3cc28c12jc1 NCI_CGAP_Co18 Homo sapiens 


10j61 


439883 


AL359652 


Hs.171098 Homo sapiens EST from done DKFZp434A041 


1058 


406513 


AA715328 


HSJ291205 to IS 


1057 


407826 


AA1 28423 


Hs.40300 caipatn 3, (p94) 


1057 


419550 


D50918 


Hs5Q998 K1AA0128 protein; septn 2 


1056 


428522 


R10184 


U> IMfim PPT* t»l t.L - At 1 14 I h 11 j ■ II •ill* 

Hs.191987 ESTs, Weakly similar to ALU1_HUMAN ALU S 


1056 


459526 


All 42350 


Hs.145735 EST 


1055 


411448 


AA1 78955 


ii_ mj Ann cpt» 

Hs271439 ESTs 


1054 


410102 


AW248508 


Hs279727 ESTs; 


1052 


406577 






1052 


408405 


AK001332 


Hs.44672 hypothetical protein FU10470 


1051 


428966 


AF059214 


Hs.1 94687 cholesterol 25-nydroxylase 


1050 


400880 






10.48 


415875 


AA894876 


Hs.5687 protein phosphatase 1 B formerly 2C), ma 


10.48 


434715 


BE005346 


II. J i/iiJA POT- 

Hs.1 16410 ESTS 


10.46 


406B51 


AA609784 


Hs.1 80255 major histocompatibility complex, class 


10.44 


413409 


Al 638418 


HS21745 ESTs 


10.44 


418489 


U76421 


Hs55302 adenosine deaminase, RNA-spedfic, B1 (h 


10.44 


419465 


AW500239 


Hs21187 Homo sapiens cDNA: FU23068 Ss, done L 


10.44 


419544 


AI909154 


gb:GV-BT200-01 0499-007 BT200 Homo sapien 


10.44 


432180 


Y18418 


Hs272822 RuvB (E coB homotogHte 1 


1044 


413622 


R08950 


Hs272044 ESTs, WeaJdy similar to ALU1_HUMAN ALUS 


1042 


437446 


AA788946 


Hs.1 6869 EST s, Moderately similar to CA1C RAT COL 


1041 


415701 


NM_003878 


Hs.7861 9 gamma-giutamyl hydrolase (conjugase, fol 


10.41 


443790 


NMJX335Q0 


Hs.9795 acyKtoenzyme A oxidase 2, branched chai 


1040 


458873 


AW1 50717 


Hs296176 STAT induced STAT inhibitor 3 


1058 


415082 


AA160000 


Hs.137398 ESTs 


10.37 


429124 


AW505086 


Hs.196914 minor histoconpatMty antigen HA-t 


1056 


417187 


AB011151 


H3515Q5 KIAA0579 protein 


1054 


426827 


AW067805 


Hs.172665 rnetr^rietBtrahydrofoIate dehydrogenase 


1054 


424280 


NM.000030 


Hs271366 aianine-glyoxyiate ajTWOtransferase homo 


1053 


446099 


T93096 


Hs.17126 ESTs 


1052 


423445 


NM.014324 


Hs.1 28749 aipha-methyiacyWoA racemase 


1051 


409995 


AW960597 


Hs^0164 ESTs 


1050 


432242 


AW022715 


Hs.1 62160 ESTs, Weakly similar to ALU4_HUMAN ALU S 


1050 


406394 


AA172106 


Hs.1 10950 Rag C protein 


1050 


406189 




1029 


422283 


AW411307 


Hs.1 14311 CDC45 (ceQ division cyde 45, Sxerevis 


1056 


401598 


AA172106 


Hs.1 10950 Rag C protein 


* 1026 


456995 


789832 


Hs.170278 ESTs 


1026 


416511 


NM.006762 


Hs.79356 Lysosoma^associated multispannmg membr 


1024 


427274 


NMJJ05211 


Hs.174142 colony stimulating factor 1 receptor, fo 


1024 


401384 






1023 


456226 


013168 


Hs.B2002 endotheHn receptor type B 


1022 


426928 


AF037062 


Hs.172914 retinol dehydrogenase 5 {11-cisand 9-cis 


1021 


423032 


AI684746 


Hs.1 19274 ESTs 


1020 


436556 


AI364997 


Hs.7572 ESTs 


1020 


418400 


BE243Q26 


Hs.301989 K1AA0246 protein 


10.19 


437401 


AA757196 


Hs.121190 ESTs 


10.19 


403690 






10.17 


423790 


BE152393 


gbCM2-HT0323-171199-033^08 HT0323 Homo 


10.16 


434094 


AA305599 


Hs238205 hypothetical protein PRO2013 


10.16 


434967 


AW975009 


K3J292274 ESTs 


10.16 


432827 


Z68128 


H&3109 RhoGTPase activating protein 4 


10.16 


432660 


AI288430 


Hs£4004 ESTs 


10.14 
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452234 AW0B4176 Hs.223296 ESTs 10.14 

445629 AI245701 gb^k31f05jd NCI_CGAP_Kid3 Homo sapiens 10.13 

457236 AA626142 Hs.179991 ESTs, Weakly similar to KPC^HUMAN PROTE 10.13 

444605 AI174603 Hs.254105 enolase 1, (alpha) 10.12 

450313 AI038989 Hs.24809 hypothetical protein FLJ10826 10.12 

407482 NM_006056 10.12 

449971 AA807346 Hs.288581 Homo sapiens cDNA FLJ14296 fis, clone PL 10.11 

441201 AW118822 Hs.128757 ESTs 10.10 

435157 AW014605 Hs.179872 ESTs 10.10 

417308 H60720 Hs.81892 KIAA0101 gene product 10.09 

442582 AI204266 Hs.179303 ESTs 10.05 

437252 AI433833 Hs.164159 ESTs, Weakly similar to ALU1_HUMAN ALU S 10.04 

448663 BE614599 Hs.106823 H.sapiens gene from PAC 426I6, similar t 10.04 

434467 BE552368 Hs.231853 Homo sapiens cDNA FLJ13445 fis, clone PL 10.04 

423698 AA329796 Hs.1098 DKFZp434J1813 protein 10.02 

412707 AW206373 Hs.16443 Homo sapiens cDNA: FLJ21721 fis, clone C 10.00 

414658 X58528 Hs.76781 ATP-binding cassette, sub-family D (ALD) 10.00 

421832 NM_016098 Hs.108725 HSPC040 protein 10.00 

423554 M90516 Hs.1674 glutamine-fructose-6-phosphate transamin 10.00 

452039 AI922988 Hs.172510 ESTs 10.00 

434673 AW137442 Hs.136965 ESTs 10.00 

427978 AA418280 Hs.180040 Homo sapiens cDNA: FLJ22439 fis, clone H 10.00 

457603 BE501815 Hs.198011 ESTs 9.99 

428279 AA425310 Hs.155766 ESTs 9.98 

444412 AI147652 Hs£16381 Homo sapiens clone HH409 unknown mRNA 9.98 

417049 N72394 Hs.44862 ESTs 9.96 

427509 M62505 Hs£161 complement component 5 receptor 1 (C5a I 9.96 

445424 AB028945 Hs.12698 cortactin SH3 domain-binding protein 9.96 

443678 AW009605 Hs.231923 ESTs 9.96 

447567 AW474513 Hs.224397 ESTs, Weakly similar to 848013 proline* 9.94 

414709 AA704703 Hs.77031 Sp2 transcription factor 9.94 

434596 T59538 gb:yb65g12.s1 Stratagene ovary (937217) 9.94 

427630 BE276115 Hs.144980 ESTs, Weakly similar to CA13_HUMAN COLLA 9.93 

416111 AA033813 Hs.79018 chromatin assembly factor 1, subunit A ( 9.92 

423349 AF010258 Hs.127428 homeo box A9 9.92 

424308 AW975531 Hs.154443 minichromosome maintenance deficient (S. 9.92 

416814 AW192307 Hs.80042 dolk^P-Gid^9Gic^iAc2-PP^or[cnylgl 9.90 

417986 AA481003 Hs.97128 ESTs 9.90 

425174 D87450 Hs.154978 KIAA0261 protein 9.90 

438171 AW976507 Hs.293515 ESTs 9.90 

421984 AW972187 Hs.110443 hypothetical protein FLJ22215 9.89 

408597 NM_005291 Hs.46453 G protein-coupled receptor 17 9.88 

413907 AI097570 Hs.71222 ESTs 9.87 

451296 AW801383 Hs.118578 H.sapiens mRNA for ribosomal protein L18 9.86 

433409 AI278802 Hs.25661 ESTs 9.85 

450360 AW117416 Hs.245484 ESTs 9.85 

433104 AL043002 Hs.128246 ESTs, Moderately similar to unnamed prot 9.84 

449824 AI962552 Hs.226765 ESTs 9.84 

452744 AI267652 Hs.30504 Homo sapiens mRNA; cDNA DKFZp434E082 (fr 9.82 

431065 AF026273 Hs^49175 interleukin-1 receptor-associated kinase 9.82 

426457 AW894667 Hs.169965 chimerin (chimaerin) 1 930 

443371 AI792888 Hs.145489 ESTs 930 
437159 AL050072 gb:Homo sapiens mRNA; cDNA DKFZp566E1346 9.75 

425242 D13635 Hs.155287 KIAA0010 gene product 9.74 

447498 N67619 Hs.43687 ESTs 9.74 

426759 AI590401 Hs.21213 ESTs 9.73 

435129 AI381659 Hs.267086 ESTs 9.72 

437672 AW748265 Hs.5741 flavohemoprotein b5+b5R 9.72 

438209 AL120659 Hs.6111 KIAA0307 gene product 9.72 

438440 AA807228 Hs.225161 ESTs 9.72 

449720 AA311152 Hs.288708 ESTs; Weakly similar to KIAA0226 [H.sapi 9.72 

414291 AI289619 Hs.13040 ESTs 9.72 

436206 AK001451 Hs.265561 CD2-associated protein 9.70 

446896 T15767 Hs.22452 Homo sapiens cDNA: FLJ21084 fis, clone C 9.70 

412667 AW977540 Hs.269254 ESTs 9.70 

423301 S67580 Hs.1645 cytochrome P450, subfamily IVA, polypept 9.67 

440757 AW118645 Hs.160004 ESTs 9.67 

441412 AI393657 Hs.159750 ESTs 9.66 

421044 AF061871 Hs.101302 collagen, type XII, alpha 1 9.66 
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414726 BE466863 H&280099 ESTs 9.66 

418435 R91679 Hs.124981 ESTs 9.66 

433480 X02422 Hs.181125 Immunoglobulin lambda locus 9.65 

441530 AI248301 Hs.127112 ESTs 9.65 

433533 053304 Hs.65394 ESTs 9.65 

421470 R27496 Hs.1378 annexin A3 9.64 

438613 C05569 Hs^43122 hypothetical protein FLJ13057 similar to 9.64 

429324 AA488101 Hs.199245 inactivation escape 1 9.62 

450244 AA007534 Hs.125062 ESTs 9.62 

407660 AW063190 Hs.279101 ESTs 9.61 

406554 9.60 

426404 AA377607 Hs£73138 ESTs 958 

447045 AW392394 Hs.278569 KIAA0064 gene product 9.58 

449894 AK001578 Hs.24129 hypothetical protein FLJ10716 958 

448376 AI494332 Hs.196963 ESTs 958 

407902 AL117474 Hs.41181 Homo sapiens mRNA; cDNA DKFZp727C191 (fr 956 

446572 AV659151 Hs.282961 ESTs 956 

459245 BE242623 Hs.51939 manic fringe (Drosophila) homolog 955 

423545 AP000692 Hs.129781 chromosome 21 open reading frame 5 954 

414697 BE266134 Hs.76927 translocase of outer mitochondrial membr 954 

410846 AW807057 gb:MR4-ST0062-031199418-b03 ST0062 Homo 952 

421181 NM_005574 Hs.184585 LIM domain only 2 (rhombotin-like 1) 952 

427308 D26067 Hs.174905 KIAA0033 protein 952 

415995 NM_004573 Hs.994 phospholipase C, beta 2 951 

434846 AW295389 Hs.119768 ESTs 951 

414342 AA742181 Hs.75912 Homo sapiens cDNA: FLJ22199 fis, clone H 950 

416959 D28459 Hs.80612 ubiquitin-conjugating enzyme E2A (RAD6 h 950 

443123 AA094538 Hs.6588 ESTs 9.50 

439312 AA833902 Hs.270745 ESTs 948 

449375 R07114 Hs.271224 ESTs 9.48 

436357 AJ132085 gb:Homo sapiens mRNA for axonemal dynein 9.44 

458723 AW137726 Hs.144352 ESTs, Moderately similar to laminin alph 9.44 

457526 AW450584 Hs.192131 ESTs, Weakly similar to RIBB [H.sapiens] 9.43 

404741 9.43 

422409 NM_005428 Hs.116237 vav 1 oncogene 9.43 

403708 9.42 

408806 AW847814 Hs.289005 Homo sapiens cDNA: FLJ21532 fis, clone C 942 

417380 T06809 gb:EST04698 Fetal brain, Stratagene (cat 942 

422501 AA354690 Hs.144967 ESTs 942 

426197 AA004410 Hs.167835 acyl-Coenzyme A oxidase 1, palmitoyl 942 

452624 AU076606 Hs.30054 coagulation factor V (proaccelerin, labi 9.42 

412110 AW893569 gb^C(W4N0021-O404O(Ml21-c10NN0Q21 Homo 9.41 

414158 AA361623 Hs.288775 Homo sapiens cDNA FLJ13900 fis, clone TH 9.41 

408101 AW968504 Hs.123073 CDC2-related protein kinase 7 940 

414171 AA360328 Hs.865 RAP1A, member of RAS oncogene family 9.40 

415947 U04045 Hs.78934 mutS (E coli) homolog 2 (colon cancer, 9.40 

426959 BE262745 gb:601153869F1 NIH_MGCJ9 Homo sapiens c 9.39 

417519 AI689987 Hs.177669 ESTs, Weakly similar to RMS1_HUMAN REGUL 959 

457181 BE514362 Hs.296422 FK506-binding protein 3 (25kD) 939 

402835 958 

404632 958 

446566 H95741 Hs.17914 Homo sapiens cDNA: FLJ22801 fis, clone K 9.37 
455369 AW903533 gb:CM14M1031-06040(M7B-dQ5 NN1031 Homo 957 

444001 AI095087 Hs.152299 ESTs, Moderately similar to ALU5_HUMAN A 956 

458191 AI420611 Hs.127832 ESTs 958 

431374 BE258532 Hs.251871 CTP synthase 954 

429327 AA283981 Hs.199248 prostaglandin E receptor 4 (subtype EP4) 953 

407061 X97748 gb:H.sapiens PTX3 gene promoter region. 953 

416967 BB616731 Hs.50645 Interferon regulatory factor 1 953 

423013 AW875443 Hs.22209 secreted modular calcium-binding protein 953 

439461 AA693960 Hs.103158 ESTs 953 

418830 BE513731 Hs.58959 Human DNA sequence from clone 967N21 on 952 

422763 AA033699 Hs.83938 ESTs, Moderately similar to MASP-2 [H.sa 952 

442739 NM_007274 Hs*679 cytosolic acyl coenzyme A thioester hydr 952 

452859 AI300555 Hs.288158 Homo sapiens cDNA: FLJ23591 fis, clone L 952 

403237 952 

415000 AW025529 Hs.539812 ESTs, Weakly similar to CALM_HUMAN CALMO 951 

417951 AW976410 Hs.289069 Homo sapiens cDNA: FLJ21016 fis, clone C 950 

419066 Z98492 Hs.6975 PRO1073 protein 950 
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422631 BE218919 Hs.1 18793 hypothetical protein RJ1068B 8.63 

410679 AW795196 Hs.215857 ring finger protein 14 8.63 

431585 BE242803 Hs.262823 hypothetical protein FLJ10326 8.62 

401851 8.62 

401866 8.62 

407783 AW996872 Hs.172028 a disintegrin and metalloproteinase doma 8.62 

408242 AA251594 Hs.43913 PIBF1 gene product 8.62 

422250 AW408530 Hs.113823 ClpX (caseinolytic protease X, E.coli) 8.62 

430259 BE550162 Hs.127826 RalGEF-like protein 3, mouse homolog 8.62 

452598 AI831594 Hs.68647 ESTs, Weakly similar to ALU7_HUMAN ALU S 8.62 

419541 AW749617 gb:RC3-BT0502-130100-012-g07 BT0502 Homo 8.60 

428839 AI767756 Hs.82302 ESTs 8.60 

429328 AA828402 Hs.47939 ESTs 8.60 

451491 AI972094 Hs.286221 Homo sapiens cDNA FLJ13741 fis, clone PL 8.60 

452561 AI692181 Hs.49169 KIAA1634 protein 8.60 

420027 AF009746 Hs.94395 ATP-binding cassette, sub-family D (ALD) 8.60 

435205 X54136 Hs.181125 Immunoglobulin lambda locus 8.60 

430900 U91939 Hs.248123 G protein-coupled receptor 25 8.60 

405074 859 

437991 AI479773 Hs.181679 ESTs 859 

436346 BE328882 Hs.193096 ESTs, Moderately similar to U1 INHUMAN U 858 

411079 AA091228 gb:cchn2152^eq.F Human fetal heart, Lam 857 

418452 BE379749 Hs.55201 C-type (calcium dependent, carbohydrate- 856 

429109 AL008537 Hs.196352 neutrophil cytosolic factor 4 (40kD) 856 

448019 AW947164 Hs.195641 ESTs 856 

449865 AW204272 Hs.189371 ESTs 855 

431180 H55883 gb:yq94h03.r1 Soares fetal liver spleen 854 

445988 BE007663 Hs.13503 inactivation escape 2 854 

405076 8.54 

407235 D20569 Hs.169407 SAC2 (suppressor of actin mutations 2, y 854 

414807 AI738616 Hs.77348 hydroxyprostaglandin dehydrogenase 15-(N 854 

425671 AF193612 Hs.159142 lunatic fringe (Drosophila) homolog 854 

452413 AW082633 Hsi12715 ESTs 854 

421620 AA446183 Hs.91885 ESTs 853 

444539 AI955765 Hs.146907 ESTs 852 

415102 M31899 Hs.77929 excision repair cross-complementing rode 851 

405552 851 

418068 AW971155 Hs.293902 ESTs, Weakly similar to prolyl 4-hydroxy 850 

420133 AA426117 Hs.14373 ESTs 850 

438887 R68857 Hs.265499 ESTs 850 

446468 AI765890 Hs.16341 ESTs; Moderately similar to !!!! ALU SUB 850 

446585 AV659397 Hs.282948 ESTs 850 

441896 AW891873 gbOmfTO09CH)40500-1^bQ2 NTT0090 Homo 850 

437718 AI927288 Hs.196779 ESTs 8.48 

420656 AA279098 Hs.187636 ESTs 8.48 

429303 AW137635 Hs>44238 ESTs 8.48 

450624 AL043983 Hs.125063 Homo sapiens cDNA FLJ13825 fis, clone TH 8.48 

452573 AI907957 Hs.287622 Homo sapiens cDNA FLJ14082 fis, clone HE 8.48 

456341 AA229126 Hs.122647 N-myristoyltransferase 2 8.48 

423024 AA593731 Hs.75613 CD36 antigen (collagen type I receptor, 8.47 

446985 AL038704 Hs.156827 ESTs, Weakly similar to ALU1_HUMAN ALU S 8.46 

431778 AL080276 Hs.268562 regulator of G-protein signalling 17 8.46 
400268 8.48 

421828 AW891965 Hs.289109 dimethylarginine dimethylaminohydrolase 8.45 

417022 NM_014737 Hs.80905 Ras association (RalGDS/AF-6) domain fam 8.44 

421029 AW057782 Hs.293053 ESTs 8.44 

425171 AW732240 Hs.300615 ESTs 8.44 
459070 AI814302 gb:w}71c12jc1 NCI_CGAP_Lu19 Homo sapiens 8.42 

406006 8.42 

412643 AW971239 Hs.293982 ESTs 8.42 

424775 AB014540 Hs.153026 SWAP-70 protein 8.42 

446848 AW136083 Hs.1 95266 ESTs, Weakly simSar to S59501 interfero 8.42 

448043 AI458653 Hs201881 ESTs 8.41 
407183 AA358015 gb£ST66864 Fetal lung III Homo sapiens 8.40 

65 412324 AW978439 Hs.69504 ESTs 8v40 

419594 AA013051 Hs£1417 topoisomerase (DNA) II binding protein 8.40 
430968 AW972830 gb:EST384925 MAGE resequences, MAGL Homo 8.40 

431689 AA305688 Hs£67695 UDP-GatbetaGlcNAc beta 1^-galactosyitr 8.40 

438582 AI521310 Hs.283365 ESTs, Weakly similar to ALU5.HUMAN ALUS 8.40 
447685 AL122043 Hs.19221 hypothetical protein DKFZp566Q1424 8.40
459119 AW844498 Hs289Q52 Homo sapiens LENG8 mRNA, variant C, part 8.38
400817 8.37
425265 BE245297 gb:TCBAP1E2482 Pediatric pre-B cell acut 8.37
409385 AA071267 gbam61g01.r1 Stratagene fibroblast (937 8.36
439121 BE047779 Hs.44701 ESTs 8.36
419968 X04430 Hs.93913 interleukin 6 (interferon, beta 2) 83
408327 AW182309 Hs249963 ESTs, Highly similar to dll 170K4.4 [H.sa 83
403976 8.34
448064 AA379036 gb:EST91809 Synovial sarcoma Homo sapien 83
442914 AW188551 Hs.99519 Homo sapiens cDNA FLJ14007 fis, clone Y7 83
428032 AW997704 Hs.11493 Homo sapiens cDNA FLJ13536 fis, clone PL 8.32
434194 AF119847 Hs283940 Homo sapiens PRO1550 mRNA, partial cds 8.32
458677 AW937670 Hs254379 ESTs 8.32
420925 NM_015698 Hs.100391 T54 protein 83
416475 T70298 gb:ytr26g02.s1 Soares fetal liver spleen 83
416852 AF283776 Hs.80285 Homo sapiens mRNA; cDNA DKFZp585Cl723 (f B3
430676 AF084866 gb:Homo sapiens envelope protein RtC-3 ( 83
428455 A1732694 Hs.98520 ESTs 8.29
435343 AW194962 Hs.199028 ESTs 83
450783 BE266695 gb£01 19Q242F1 NIH_MGC_7 Homo sapiens CD 83
404946 8.28
422942 AF054839 Hs.122540 tetraspan 2 83
453716 AA037675 Hs.152675 ESTs 83
437098 AA744488 Hs.132842 ESTs, Moderately similar to ALU1_HUMAN A 83
443907 AU076464 Hs.9963 TYRO protein tyrosine kinase binding pro 8.27
401930 AF106069 Hs3168 ubiquitin specific protease 15 83
446554 AA151730 Hs.301789 ESTs, Weakly similar to similar to C.ele 83
426290 AB007918 Hs.169182 KIAA0449 protein 63
419904 AA974411 Hs.18672 ESTs 83
413886 AW958264 Hs.103832 ESTs, Weakly similar to TRHY_HUMAN TRICH 8.24
424738 AI963740 Hs.46826 ESTs 8.24
427359 AW020782 Hs.79881 Homo sapiens cDNA: FLJ23006 fis, clone L 8.24
424534 D87682 Hs.150275 KIAA0241 protein 8.24
424429 U63830 Hs.146847 TRAF family member-associated NFKB activ 8.24
442604 BE263710 Hs279904 ESTs 83
442992 A1914699 Hs.13297 ESTs 83
427210 BE396283 Hs.173987 eukaryotic translation initiation factor 83
457229 BE222450 HS266390 ESTs 8.21
423730 AA330214 gb:EST33935 Embryo, 12 week II Homo sapi 8.21
411928 AA888624 Hs.19121 adaptor-related protein complex 2, alpha 8.20
416051 AA835668 Hs 25253 Homo sapiens cDNA: FLJ20935 fis, clone A 8.20
417231 R40739 Hs21326 ESTs 83
422049 W25760 Hs.77631 glycine cleavage system protein H (amino 83
427528 AU077143 Hs.179565 minichromosome maintenance deficient (S. 83
458776 AV654978 Hs.19904 cystathionase (cystathionine gamma-lyase 8.19
417687 AI828596 Hs250691 ESTs 8.18
423218 NM_015896 Hs.167380 BLu protein 8.18
425397 J04088 Hs.156346 topoisomerase (DNA) II alpha (170kD) 8.18
406964 M21305 Hs247946 Human alpha satellite and satellite 3 |u 8.16
402401 U42349 Hs.71119 Putative prostate cancer tumor suppresso 8.16
423397 NM_001838 Hs.1652 chemokine (C-C motif) receptor 7 8.18
427857 AL133017 Hs2210 thyroid hormone receptor interactor 3 8.17
401519 8.17
447188 H65423 Hs.17631 Homo sapiens cDNA FLJ20118 fis, clone CO 8.16
424704 AI263293 Hs.152096 cytochrome P450, subfamily IIJ (arachido 8.16
435854 AJ278120 Hs.4996 DKFZP564D166 protein 8.14
448556 AW885606 Hs.5064 ESTs 8.14
449217 AA27B536 Hs23262 ribonuclease, RNase A family, k6 8.14
453124 AI139058 Hs23296 ESTs 8.14
442812 AI018406 Hs.131284 ESTs 8.14
421129 BE439899 Hs3271 ESTs 8.14
TABLE 9A shows the accession numbers for those primekeys lacking a UnigeneID in Table 
9. For each probeset, we have listed the gene cluster number from which the oligonucleotides 
were designed. Gene clusters were compiled using sequences derived from Genbank ESTs 
and mRNAs. These sequences were clustered based on sequence similarity using Clustering 
and Alignment Tools (DoubleTwist, Oakland, California). The Genbank accession numbers 
for sequences comprising each cluster are listed in the "Accession" column. 



Pkey: Unique Eos probeset identifier number 

CAT number: Gene cluster number 

Accession: Genbank accession numbers 
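
The three columns described above can be read mechanically. The following is a minimal sketch, not part of the original disclosure, of how one row of Table 9A might be split into its Pkey, CAT number, and accession list. It assumes the row is whitespace-delimited and that the CAT number is a single token; the names `ClusterRow` and `parse_cluster_row` are hypothetical and chosen only for illustration.

```python
from typing import NamedTuple


class ClusterRow(NamedTuple):
    pkey: int          # unique Eos probeset identifier number
    cat_number: str    # gene cluster number, kept as text
    accessions: list   # Genbank accession numbers in the cluster


def parse_cluster_row(line: str) -> ClusterRow:
    """Split one whitespace-delimited row into Pkey, CAT number, accessions."""
    fields = line.split()
    if len(fields) < 3:
        raise ValueError(f"expected Pkey, CAT number and an accession: {line!r}")
    return ClusterRow(int(fields[0]), fields[1], fields[2:])


# Row for Pkey 416475 as listed in Table 9A.
row = parse_cluster_row("416475 1596398.1 T70298 H58072 R02750")
print(row.pkey, row.cat_number, row.accessions)
```

Keeping the CAT number as text avoids any assumption about how its suffix should be interpreted numerically.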



410531 
410688 
410846 



410896 1226053J 



Pkey CAT number Accession
408057 1035720^-1 AW139565
408069 103655.1 H81795 Z42291 R20973 AA046920
408182 104479 1 AA047854 AA057506 AA053841
408338 1052148 J AW867079 AW887086 AW182772
408828 108463 1 BE540279 AW410659 AA057857 R77693 BE278674
409126 110159J AA063426 AW962323 AW408063 AA063503 AA772927 AW753492 BE175371 AA311147
409292 111586J AA071051 AA0705B4 AAD69938 AA102136 AA074430
409314 11 1841.1 AA070266 AA084967 AA126998
409385 112523 J AA071267 T65940 T64515 AA071334
409398 1126716.1 AW386461 AW876408 AW386672 AW386599 AW876258 AW386619 AW386289 AW876136 AW876203 AW876213 AW876301 AW876295 AWB76349 AW876365 AWB76160 AW876369 AW876352 AW876271
409671 114731J AA076769 AA076781 AI087958
409768 1 154035.1 AW499566 AW502378 AW499522 AW502046 ATO02671 AW501917 AW501868 AW501721 AW502813
409841 1156088.1 AW502139 AW502432 AW502235 AW501683 AW502647
409842 1156119.1 AW501756 AW502096 AW502465 AW501715
1 156226.1 AW502327 AW502488 AW501829 AW502625 AW502687
1207200J AW752953 H880448E156092
1216101.1 AW796342 AW796356 BE161430
AWB07057 AW807054 AW807189 AW807193 AW807369 AW807429 AW807364 AWB07365 AW807078 AWB07256 AWB07180 AW807331
AW809637 AWB09897 AW810554 AWB09707 AWB09885 AW810000 AW810088 AW809742 AW809816 AWB09749 AW809639 AWB09722 AW809836 AW809774 AW810023 AWB10013 AW809813 AW809660 AW809728 AW809768 AW809951 AWB08657 AW809954
411079 123128.1 AA091228 H71860 H71073
411424 1245497.1 AWB45985 AW845991 AW845962
411499 1248105.1 AW849292 AW849431 AW849422 AW849428 AW849420 AW849424 AW849427
411507 1248607J AW850140 AW850195 AW850192
411534 1248827.1 AW850473 AW850471 AW850431 AW850523
4119f72 1268491J BB074959 AW880160
412110 1277844.1 AWB93569 AW893571 AW893586 AW893593
412226 1284289.1 W26786 AW998612 AW9Q2272
412257 1285376.1 AW903830 BE071916
412405 1293012 1 AW948126 AW948139 AW948196 AW948145 AW948162 AW948134 AW948127 AW948124 AW948153 AW948157 AW946125 AW948131 AW948158 AW948164 AW948151
413260 1356003.1 BE075281 BE075219 BED75123 BE075119 BE075046
413471 1371778.1 BE142098 BE142092
413729 1385114J BE159999 BE160056 BE160107 BE160139
414182 142*9.1 AA136301 AI381776 AA136321
414989 1511339.1 T81668 C19040 C17569
415354 1534763.1 F06495 R24336 R13046
416011 1566439J H14487 R50911 Z43216
416475 1596398.1 T70298 H58072 R02750
417380 1672461.1 T06809 N75735
419392 1843934.-1 W28573
419541 185724.1 AW749617 R64714 AA244138 AA244137 BE094019
419544 185760.2 AI909154 AA526337 AA244193 AI909153
420819 196721 1 AA280700 AW975494 AA687385
421245 200620 1 AA285363 AA285333 AA285359 AA265326 AA285350
422673 219674.1 N59027 AA314694 N53937 R08100
422695 
422858 
422940 
423730 
423790 
424385 
424606 
425265 



430876 321 68J 



430968 
431180 
432093 
434596 
436357 
437159 
437495 



219996 J AA315158 AW961298 N76067 AW802759 AI858495 W04474 

222209.1 R35398 BE252178 AA316153 

223106.1 BE077458 AA337277 AA3 19285 

231 462 J AA330214AW982519T54709 

232031.1 BE152393 AA330984 BE073904 

238731J AA339666 AW952809 AA3491 19 

241409 1 AA343936 AA344060 AW963081 

249175 J BE245297AA353976AW505023 

1 BE262745 

AF084866 AF0B4870 AF084864 ATO84867 AF084869 AF084865 AF084868 AW818206 AW812038 BE144813 BE144812 
AW812041 AW812040 AW812067 BE061583 BE061604 T05808 AI352469 AA580921 BE141783 BE141782 BE061601 
AW814393AW885029 

1 AW972830 AA527647 AA489820 AA570362 

328906 1 H55883 AW971249 AA493900 H55788 

341283.1 H28383 AW97267Q H2B359 AA525808 

38937J T59538 T59589 T59598 T59542 AF 147374 

41842 1 AJ132085Z83805 

43393 J ALO50O72AW900148 

43765.1 BE177778 BE177779 AL3901B0 AA359908 

46858.1 H66948AF085954 H66949 

46879.1 H56389AF085977H56173 

48675.1 BE410734 BE5601 17 BE270054 BE296330 BE267957 AI003007 BE545259 

52842.1 AW891873 AWB91897 BE564764 

645767 J AE45701 BE272724 

71288 1 BE617135 AW504051 AW504283 

74761J AA379036 AA150589 AI696854 BE621316 

84655.1 BE266695 BE265474 N53200 BE267333 

85673.1 AA215672 AI696628 AA013335 H86334 AA017006 

921802.1 AI907039AI907081 

922216 1 BEQ77084 AW139963 AW863127 AW806209 AW8062O4 AW806205 AW806206 AW80621 1 AW806212 AW806207 AW806208 
AW806210AI807497 

452712 928309.1 AW838616 AW838660 BE144343 AI914520 AW888910 BE184854 BE184784 

453758 980026 1 U83527 AL1 20938 U83522 

454093 1007366J AW860158 AW882385 AW8BD159 AWBB2386 AWB52341 AW821869 AW821893 AW062680 AW062656 

454563 1224342.1 AW807530 AW807540 AW807537 AW846086 BEH1634 AW846089 AW807499 AW807533 AW838499 

454791 1234759 1 BE071874 BE071882 AW820782 AW821007 

454977 1247099 1 AW848032 AW848630 AW848478 AW848623 AW848484 AW848169 AW848830AW848149 AW848119 AW848893AW848903 
AW848407 

455131 1254674.1 AW857913 AW857916 AW857914AW861627AW861626 AW861624 

455183 1259023 1 AW984111 AW863918 AW863856 

455254 1266449 J AW877015 AW877133 AW876978 AW877071 AW876988 AW877069 AW877063 AW877013 

455369 1285173.16 AW903533 AW903516 AW903562 BB0852O2 BE085215 BE085214 BBD85209 BE085172 BE085175 BE085193 BE085211 
BE085199 

455S82 1396849 1 BE1 76862 BE1 76876 BE1 76947 BE176878 

456011 1410860J BE24362B BE246081 BE247016 BE241984 BE241534 BE246091 BE245679 BE243620 BE245998 BE242329 BE241417 
BE241457 BE242522 BE241989 BE241464 

1416335 1 R00028 BE247630 

360505 1 AW062439 AW751554 AA579463 

364225.-1 AA584854 

399422.1 AI908236AA663731 
AI814302AI814428 

.1 W07808AI822066 

918957.1 AI903354AI903489AI903488 

921149.1 BE063380 BE063346 AI906097 

945240 -1 AI940425 



439120 
440134 
441896 
445629 
447229 
448064 
450783 
451045 
452549 



456023 
457586 
457595 
457751 
459070 
459081 
459145 
459172 
TABLE 9B shows the genomic positioning for those primekeys lacking Unigene IDs and 
accession numbers in Table 9. For each predicted exon, we have listed the genomic sequence 
source used for prediction. Nucleotide locations of each predicted exon are also listed. 

Pkey: Unique number corresponding to an Eos probeset 

Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the 
publication entitled "The DNA sequence of human chromosome 22," Dunham I. et al., Nature (1999) 402:489-495. 

Strand: Indicates DNA strand from which exons were predicted. 

Nt_position: Indicates nucleotide positions of predicted exons. 
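
As an illustration of how the nucleotide position entries can be read, the following sketch (not part of the original disclosure) splits a position string into exon intervals and sums their lengths. It assumes that exons are comma-separated "start-end" pairs and that both coordinates are inclusive; the function names `parse_exons` and `total_exon_length` are hypothetical.

```python
def parse_exons(nt_position: str) -> list:
    """Turn '118036-118166,118681-118807' into a list of (start, end) tuples."""
    exons = []
    for span in nt_position.split(","):
        start, end = span.strip().split("-")
        exons.append((int(start), int(end)))
    return exons


def total_exon_length(exons) -> int:
    """Sum of exon lengths, treating both coordinates as inclusive (assumption)."""
    return sum(end - start + 1 for start, end in exons)


# Position entry listed for Pkey 400615 in Table 9B.
exons = parse_exons("118036-118166,118681-118807")
print(exons, total_exon_length(exons))
```

Entries whose hyphens or commas did not survive reproduction would raise a `ValueError` here and are best handled by hand.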



Pkey Ref Strand Nt_position
400452 8113550 Minus 80308-90505
400557 9801261 Plus 208453-208528,209633-209813
400615 9908994 Plus 118036-118166,118681-118807
400802 8567867 Minus 174571-174856
400817 8569994 Plus 170793-170948
400880 9931121 Plus 29235-29336,36363-36580
400885 9958187 Minus 58242-58733
400926 7651921 Minus 52033^158 ( 53955^4120 f 54957-55052^5420'55480 f 56452-56666 t 57221"57718
400952 7658481 Plus 192667-192826,194387-194876
400991 8096825 Plus 159197-159320
401044 8117619 Plus 73501-73674
401124 8570296 Minus 124181-124391
401163 6981820 Plus 5302-5545
401201 9743387 Minus 138534-138629,139234-139294,140121-140335,142033-142479
401286 9801342 Minus 147036-147318
401384 6850939 Minus 58360-58545
401468 6433826 Plus 13056-13482
401515 7630851 Plus 29929-30126
401519 6649315 Plus 157315-157950
401672 9838136 Plus 128526-128704,130755-130860
401744 2576349 Plus 14595-14751
401851 7770425 Minus 146443-146664,147794-147971,148351-148480,148980-149111,149801-149949
401866 8018106 Plus 73126-73623
402240 7690131 Plus 104382-104527,106136-106372
402359 9211204 Minus 4040341961
402585 9908890 Minus 174893-175050,183210-183435
402788 9796102 Plus 98273-101430
402802 3287156 Minus 53242-53432
402812 6010110 Plus 25026-25091,25844-25920
402828 8918414 Plus 69071-69642
402835 9187337 Plus 26961-27101
402838 9369121 Minus 32589-32735,35478-35666
402842 9369121 Minus 76355-76479
402895 9967547 Plus 85537-85671,86379-85469
402964 9581599 Minus 4882446784
403137 9211494 Minus 92349-92572^2958-93084,93579-93712^3949-94072^4591-94748^5214-95337
403237 7637807 Plus 7271-7527
403259 7770585 Plus 46934857
403683 7331517 Plus 217175-217446
403690 7387384 Minus 78627-79583
403708 5705981 Minus 134394-134812
403838 4176355 Plus 19197-19502
403851 7708872 Plus 22733-23007
403976 7657840 Plus 24755-24969
404407 7329316 Minus 4815448499
404426 7407959 Plus 77842-77954
404632 9796668 Plus 4509645229
404741 8574139 Plus 143025-143467
404756 7706327 Plus 82849-83627
404946 7382189 Plus 134445-134750
405074 7770440 Plus 4434044559,4479045059
405125 8247873 Plus 137113-137814
405172 9966752 Plus 153027-153262
405236 7248076 Minus 151699-151915
405325 6094661 Minus 25818-26380
405411 3451356 Minus 17503*17778 18021-18290
405495 8050952 Minus 72182-72373
405552 1552506 Plus 45199-45647
405601 5615493 Minus 147835-147935,149220-149299
405685 4508129 Minus 37956-38097
405777 7263187 Minus 104773-105051
405856 7653009 Plus 101777-102043
405876 6758747 Plus 39694-40031
405932 7767812 Minus 123525-123713
405934 6758795 Plus 159913-160605
406006 8247801 Minus 4264042776
406134 9163473 Plus 153291-153452
406189 7289992 Minus 22007-22234
403422 9256411 Plus 163003-163311
405516 7711422 Minus 128375-128449,128560-128784
406538 7711478 Plus 351 95-35367,3822&-38476,4008(M02 1 6.43522-43B40
406554 7711566 Plus 106956-107121
406577 7711730 Plus 11377-11509
TABLE 10 shows genes, including expressed sequence tags, differentially expressed in 
taxol-resistant prostate tumor xenografts as compared to taxol-sensitive prostate tumor 
xenografts. The genes are indicated as either being upregulated or downregulated during the 
induction of taxol resistance in sequential passages of the grafts. 

Pkey: Unique Eos probeset identifier number 

ExAccn: Exemplar Accession number, Genbank accession number 

UnigeneID: Unigene number 

Unigene Title: Unigene gene title 

Eos: Internal Eos name 

F00-F14: passage number 
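
The up or down designation across passages can be illustrated with a simple comparison of early and late passages. The sketch below is offered only as an example and is not the method used to generate Table 10: it assumes that the per-passage values are positive expression levels listed in passage order, and the function name `passage_trend`, the three-passage window, and the two-fold cutoff are all hypothetical choices.

```python
from statistics import mean


def passage_trend(values, n_edge=3, fold=2.0):
    """Classify a passage series as 'up', 'down', or 'flat'.

    Compares the mean of the last n_edge passages with the mean of the
    first n_edge passages. The cutoff `fold` is illustrative only.
    """
    early = mean(values[:n_edge])
    late = mean(values[-n_edge:])
    if early <= 0 or late <= 0:
        raise ValueError("expression values must be positive")
    ratio = late / early
    if ratio >= fold:
        return "up"
    if ratio <= 1.0 / fold:
        return "down"
    return "flat"


# Values listed in Table 10 for Pkey 126645 (STEAP), in passage order.
steap = [106, 111, 103, 71, 34, 67, 33, 14, 2, 1, 1, 1]
print(passage_trend(steap))
```

Averaging several passages at each end makes the call less sensitive to a single noisy measurement than comparing the first and last values alone.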



Pkey ExAccn UnigeneID Unigene Title Eos Resp. F00 F00 F02 F02 F05 F05 F07 F09 F10 F11 F13 F14
117921 N51DQ2 Hs.47170 UprinA2 PM28 up 1 9 8 9 32 20 34 122 105 82 71 111
112971 T17185 Hs.4299 ESTs CHA1 down 290 281 267 335 270 284 150 157 83 89 49 75
126645 A1167942 Hs.61635 STEAP PAA5 down 106 111 103 71 34 67 33 14 2 1 1 1
119018 N95798 Hs.179809 ESTs PAB2 down 765 841 757 909 742 704 478 428 253 175 228 238
110844 N31952 Hs.167531 ESTs PAV7 down 175 192 147 141 123 129 73 65 55 48 54 84
100654 HG2841-HT2969 Hs.75442 Albumin, A PM01 down 666 605 504 728 357 445 602 187 117 127 117 113
100655 HG2841-HT2970 Hs.75442 Albumin, A PM02 down 620 653 466 688 368 386 606 175 101 95 115 97
102076 U09579 HS252437 cyclin-dep PM03 down 101 94 143 190 105 107 68 40 34 31 46 22
102208 U22961 Hs.75442 albumin PM04 down 495 424 323 518 252 296 467 188 169 143 165 145
103739 AA075779 mitochondr PM05 down 75 190 606 230 378 106 218 88 69 192 69 99
107036 AA599690 Hs.15725 SBBI46 PM06 down 87 124 115 188 132 111 66 71 49 70 38 50
108242 AA062746 ESTs PM07 down 14 20 252 13 22 43 193 10 10 104 21 18
108282 AA0S5143 solute car PM08 down 27 54 178 73 108 37 53 24 14 53 15 34
108679 AA115963 beta-1-gto PM09 down 680 893 1292 656 869 389 1 74 118 662 359 409
108731 AA126313 Hs.107476 ATP syntha PM10 down 10 19 185 25 60 1 32 3 7 14 1 1
110675 H89355 Hs.6598 adrenergic PM11 down 207 334 237 239 231 220 119 145 93 64 56 124
115412 AA283804 Hs.193552 ESTs PM12 down 146 316 282 271 340 334 115 236 100 196 83 207
115844 AA430124 Hs.234607 MDM2 PM13 down 49 93 94 154 132 91 23 54 23 76 14 41
120588 AA281591 Hs.16193 ESTs PM14 down 80 157 58 141 159 127 39 83 35 37 16 46
132349 Y00705 Hs.181286 serine pro PM15 down 146 217 214 150 106 128 177 85 54 63 63 56
132888 AA490775 Hs.5920 N-acetylma PM16 down 92 150 132 178 126 139 53 94 48 67 41 60
132967 AA032221 Hs.61635 STEAP PM17 down 224 208 203 215 205 180 132 65 68 50 48 63
133063 AA283085 Hs.64065 ESTs PM18 down 85 148 161 150 92 108 42 99 42 65 29 126
134374 D62633 HS3236 ESTs PM19 down 230 240 194 212 231 189 89 123 107 95 68 91
135400 M23263 Hs.99915 androgen r PM20 down 36 167 99 178 132 101 23 71 26 122 14 44
TABLE 11 shows genes, including expressed sequence tags, that are up-regulated in 
prostate tumor tissue compared to normal prostate tissue as analyzed using the Affymetrix/Eos 
Hu01 GeneChip array. Shown are the ratios of "average" normal prostate to "average" 
prostate cancer tissues. 

Pkey: Unique Eos probeset identifier number 

ExAccn: Exemplar Accession number, Genbank accession number 

UnigeneID: Unigene number 

Unigene Title: Unigene gene title 

R1: Background subtracted normal prostate : prostate tumor tissue 
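
Because R1 is expressed as normal prostate relative to prostate tumor, values well below 1 correspond to higher expression in tumor. The following sketch, given only as an illustration and assuming R1 is a simple quotient of the two background-subtracted averages, inverts R1 to obtain the tumor-to-normal fold change; the function name `tumor_fold_change` is hypothetical.

```python
def tumor_fold_change(r1: float) -> float:
    """Invert R1 (normal : tumor) to express tumor signal relative to normal."""
    if r1 <= 0:
        raise ValueError("R1 must be a positive ratio")
    return 1.0 / r1


# R1 values of the kind listed in Table 11.
for r1 in (0.012, 0.02, 0.05):
    print(f"R1 = {r1:<5} -> tumor/normal = {tumor_fold_change(r1):.1f}")
```

Under this reading, an R1 of 0.02 corresponds to a fifty-fold higher average signal in tumor tissue than in normal tissue.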






Pkey 

101336 
130642 
133512 
133436 
129292 
100610 
133448 
125193 

133456 
134546 

102131 
101375 

100674 
134365 
132335 
110303 
131678 



133769 
107904 
129427 
105987 
131466 



134626 
134170 
131713 
100748 
118769 
111734 
109221 
133846 
135281 
119073 
100760 
101426 
129568 
130900 
133879 
100627 
129424 

128652 
129979 



ExAccn 

L49169 
M63438 
X01677 
H44631 
X13810 

HG2566-HT4792 

M34516 

W67577 

T49257 
AA459310 

U15085 
M13560 

HQ30334fT3194 

R32377 

060387 

H37901 

N59162 

D80046 

M17733 

AA026648 

T80746 

AA406631 

F03233 

X00274 

S82198 

M63138 

X57809 

HG35174iT3711 

N74496 

R25375 

M192755 

AA480073 

AA401575 

R32894 

HG3576-HT3779 

M19483 

AA428Q25 

Z38466 

M13829 

HG27Q2-HT2798 



AA621245 
T72635 
X03068 
U67092 



129536 
133599 



M64788 



UnigeneID Unigene Title
Hs.75678 FBJ murine osteosarcoma viral oncogene homolog B
Hs.156110 Immunoglobulin kappa variable 1D-8
Hs.195188 glyceraldehyde-3-phosphate dehydrogenase
Hs.737 immediate early protein
Hs.1101 POU domain; class 2; transcription factor 2
Microtubule-Associated Protein Tau, Ail SpOcet, Exon 8
Hs.170116 immunoglobulin lambda-like polypeptide 3
Hs.84298 CD74 antigen (invariant polypeptide of major histocompatibility complex; class II antigen-associated)
Hs.183704 ubiquitin C
Hs£518 Homo sapiens mRNA; cDNA DKFZp586L1722 (from clone DKFZp586L1722)
Hs.1162 major histocompatibility complex; class II; DM beta
Hs.84298 CD74 antigen (invariant polypeptide of major histocompatibility complex; class II antigen-associated)
Spliceosomal Protein Sap 62
Hs.82240 syntaxin 3A
Hs.189885 ESTs
Hs.32706 ESTs
Hs3Q542 ESTs
HSJ250879 ESTs
Hs.75968 thymosin; beta 4; X chromosome
Hs.61389 ESTs
Hs.111334 ferritin; light polypeptide
Hs.110299 mitogen-activated protein kinase kinase 7
H&271B9 ESTs
Hs.76807 Human HLA-DR alpha-chain mRNA
HsJ709 caldecrin (serum calcium decreasing factor; elastase IV)
Hs.79572 cathepsin D (lysosomal aspartyl protease)
Hs.181125 Immunoglobulin lambda gene cluster
Alpha-1-Antitrypsin, 5' End
ESTs
Hs.126916 ESTs
H&8584Q ESTs; Weakly similar to slac [H.sapiens]
Hs.76719 U6 snRNA-associated Sm-like protein
Hs£7757 ESTs
Hs.45514 v-ets avian erythroblastosis virus E26 oncogene related
Major Histocompatibility Complex, Class II Beta W52
Hs.25 ATP synthase; H+ transprtng; mitochndri F1 complex; beta polypept
Hs.114360 transforming growth factor beta-stimulated protein TSC-22
Hs.21036 ESTs; Moderately similar to F25965_3 [H.sapiens]
Hs.77183 v-raf murine sarcoma 3611 viral oncogene homolog 1
Serine/Threonine Kinase (Gb225424)
Hs.111301 matrix metalloproteinase 2 (gelatinase A; 72kD gelatinase; 72kD type IV collagenase)
Hs.103147 ESTs; Weakly similar to similar to SP:YR40_BACSU [C.elegans]
Hs.13956 ESTs
Hs.73931 major histocompatibility complex; class II; DQ beta 1
Human ataxia-telangiectasia locus protein (ATM) gene, exons 1a, 1b, 2, 3 and 4, partial cds
Hs.184504 tryptase; alpha
Hs.75151 RAP1; GTPase activating protein 1




R1 

0.012 

0.015 

0.017 

0.017 

0.019 

0.02 

0.021 

0.022 
0.022 

0.023 
0.023 

0.023 

0.024 

0.027 

0.027 

0.028 

0.028 

0.029 

0.029 

0.03 

0.03 

0.03 

0.032 

0.032 

0.032 

0.033 

0.034 

0.034 

0.034 

0X136 

0.035 

0.036 

0.037 

0.037 

0.037 

0.038 

0.038 

0.039 

0.039 

0.039 

0.039 
0.039 
0.039 
0.04 

0.04 
0.04 
0.041 



102104 


U12139 




131340 


AA478305 


Hs.25817 


130446 


X79510 


Hs.155693 


101352 


L77701 


Hs.16297 


122593 


AA453310 


liO. IcJOIHo 


130181 


R39552 


H&151608 


134071 


214093 


Hs.78950 


108129 


AA053252 


Hs, 185 848 


130511 


L32137 


He i^RA 


133336 


AA291456 


Hs.71190 


132982 


L02326 


H&198118 


131880 


AA047034 


HS33818 




U35234 


Up 1 CQMA 

ns.io?oj4 


133467 


AA258595 


Hs.73931 


101191 


L20688 


H&83656 


101860 


M95610 


H&37165 


102799 


U88898 




107200 


D20350 


H&5628 


101166 


L14927 


Hs2099 


134289 


M54915 


H&81170 


135329 


AA436Q26 


HS.98858 


124950 


T03786 


Hs.151531 


102919 


X12447 


Hs.183760 


100574 


HG2279+IT2375 




131286 


AA450092 


Hs.25300 


102675 


U72512 




131332 


R50487 


Hs.25717 


101634 


M57731 


Hs.75765 


113118 


T47906 


H^220512 


1248B4 


R77276 


Hs.120911 


130523 


W76D97 


Hs214507 


110244 


H26742 


HS25367 


131932 


AA454980 


HS25601 


132509 


H09751 


H&5038 


133372 


AA291139 


Hs.72242 


100817 


HQ4011-HT4804 




106746 


AM76438 


Hs.7991 


135401 


L14813 


Hs.169271 


130479 


R44163 


Hs.12457 


102589 


U62015 


Hs*867 


121521 


AM12165 


H&97358 


135340 


AA425137 


HsS9093 


132336 


AA342422 


Hs.45073 


115368 


AA282133 


H&88960 


101278 


L38487 


Hs.110849 


103284 


X80200 


H&8375 


100564 


HQ2239-HT2324 




133132 


240883 


8*85588 


121811 


AA424535 


Hs.98416 


129613 


AA279481 


H&238831 


132468 


S79854 


Hs.49322 


120111 


' W95841 


Hs.136031 


103668 


Z83741 


Hs£48174 


130386 


F10874 


HS234249 


104275 


CQ2170 


Hs.39387 


106305 


AA436146 


Hs.12828 


116431 


AA609878 


Hs.55289 


120339 


AA206465 


HS256470 


114427 


AA017063 




118821 


N78070 


Hs.94789 


118979 


N93788 


Hs.43666 


107495 


W78776 


Hs.90375 


120240 


241732 


Hs.66049 



Human alpha1 (XI) collagen (COL11A1) gene, 5' region and exon 1
Homo sapiens chromosome 19; cosmid R27216
protein tyrosine phosphatase; non-receptor type 21
COX17 (yeast) homolog; cytochrome c oxidase assembly protein
alpha-methylacyl-CoA racemase
Homo sapiens clone 23622 mRNA sequence
branched chain keto acid dehydrogenase E1; alpha polypeptide (maple syrup urine disease)
ESTs; Weakly similar to !!!! ALU SUBFAMILY J WARNING ENTRY !!!! [H.sapiens]
cartilage oligomeric matrix protein (pseudoachondroplasia; epiphyseal dysplasia 1; multiple)
ESTs
immunoglobulin lambda-like polypeptide 2
RecQ protein-like 5
protein tyrosine phosphatase; receptor type; S
major histocompatibility complex; class II; DQ beta 1
Rho GDP dissociation inhibitor (GDI) beta
collagen; type IX; alpha 2
Human endogenous retroviral H protease/integrase-derived ORF1 mRNA, complete cds, and putative envelope prot mRNA, partial cds
ESTs
lipocalin 1 (protein migrating faster than albumin; tear prealbumin)
pim-1 oncogene
ESTs
protein phosphatase 3 (formerly 2B); catalytic subunit beta isoform (calcineurin A beta)
aldolase A; fructose-bisphosphate
Triosephosphate Isomerase
Homo sapiens clones 24718 and 24825 mRNA sequence
Human B-cell receptor associated protein (hBAP) alternatively spliced mRNA, partial 3'UTR
ESTs
GRO2 oncogene
ESTs
ESTs
ESTs
ESTs; Weakly similar to ALR [H.sapiens]
chromodomain helicase DNA binding protein 3
neuropathy target esterase
ESTs
Dystrophin-Associated Glycoprotein, 50 Kda, Alt. Splice 2
ESTs
Homo sapiens clone 23770 mRNA sequence
cysteine-rich; angiogenic inducer; 61
EST
Homo sapiens chromosome 19; cosmid R28379
ESTs
ESTs; Weakly similar to similar to collagen [C.elegans]
estrogen-related receptor alpha
TNF receptor-associated factor 4
Potassium Channel Protein (Gb21 1585)
ESTs; Weakly similar to dJ393P12^ [H.sapiens]
ESTs
ESTs; Weakly similar to collagen alpha 1(XVIII) chain [M.musculus]
deiodinase; iodothyronine; type III
ESTs
H2A histone family; member M
mitogen-activated protein kinase 8 interacting protein 1
ESTs; Weakly smlr to weak smlrty to ribosomal prot L14 [C.elegans]
ESTs
ESTs; Weakly smlr to 110 KD CELL MEMBRANE GLYCOPROTEIN [H.sapiens] 0,813
ESTs; Highly similar to Miz-1 protein [H.sapiens] 0.05
ESTs oss
protein tyrosine phosphatase type IVA; member 3 0.05
ESTs 0.051
ESTs 0.051




0.041 
0.041 
0.042 
0.042 
0.042 
0.042 

0.042 

0.043 

0.043 
0.043 
0.044 
0.044 
0.044 
0.044 
0.044 
0.044 

0.044 
0.044 
0.044 
0.044 
0.044 

0.044 
0.044 
0.045 
0.045 

0.045 
0.045 
0.046 
0.046 
0.046 
0.046 
0.046 
0.046 
0.046 
0.046 
0.047 
0.047 
OM 
0.047 
0.047 
0.048 

om 

0.048 
0.048 
0.048 
0.043 

om 

0.048 
0.048 
0.049 
0.049 
0.049 
0.049 
0.049 
0.049 
0.05 
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114331 Z41309 Hs.12400 ESTs 0.051 

130947 R40037 H&21506 ESTs 0.052 

129242 WB1679 Hs5174 r&osonral protetn S17 0.052 

131413 AA482390 Hs56510 ESTs; Modly smtr to vacuolar prot sorting homotog r-vps3& [R.rtorvegteus] 0.052 

5 112304 R54788 Hs^6239 ESTs 0.052 

101416 M17254 Hs.45514 v-ets avian erythroblastosis virus E26 oncogene related 0.052 

131201 AA428304 Hs£4174 ESTs 0.052 

101054 K024O5 Hs.73933 Human MHC class II HLA-DQ-beta mRNA (DR7 DQw2); complete cds 0.052 

101306 L41143 Hs.232069 T-cell leukemia translocation altered gene 0.053 

10 129311 T55087 yb45o08/1 Stratagena fetal spleen (#937205) Homo sapiens cDNA 

done IMAG&74126 5*. mRNA sequence. 0.053 

129942 U95301 Hs. 144442 phospholipase A2; group X 0.053 

119210 R93340 Hs.92995 ESTs 0.053 

101046 KO1 160 Accession not feted In Genbank 0.053 

15 114086 Z38266 Hs.12770 Homo sapiens PAC clone DJ0777O23 from 7p14-p1 5 0.053 

110171 H19964 Hs£1709 ESTs 0.053 

101004 J04101 Hs348109 v-ets avian erythroblastosis vires E26 oncogene homotog 1 0.053 

129715 N58479 Hs.12126 ESTs; Weakly similar to LR8 [H.sapiens] 0.053 

101581 M34996 Hs.198253 major hlstarr^ttol% ^ 0.053 

20 113285 T66830 Hs. 18271 2 ESTs 0.053 

127537 AA569531 Hs, 162859 ESTs 0.054 

100813 HG3995-HT4265 Cpg-Enrlched Dna, Clone S19 0.054 

101841 M93107 Hs.76893 3-hydroxybutymte dehydrogenase (heart; rmtocfiondrial) 0.054 

135053 R77159 Hs*93678 ESTs 0.054 

25 101419 M17886 Hs.177592 ribosomal protein; large; PI 0.054 

119724 W69468 Hs.47622 ESTs 0.055 

102673 LT72509 Human alternatively spliced B8 (B7) mRNA, partial sequence 0.055 

129877 AA248589 Hs.13094 ESTs; Weakly similar to ORF YGR101W [Sxerevisiae] OJ055 

114788 AA156737 Hs. 103904 EST 0.055 

30 123812 AA620607 Hs.1 11591 ESTs 0.055 

117669 N39237 Hs.44977 ESTs 0.055 

123782 AA610111 Hs.162695 EST 0.055 

102395 U41767 Hs.92208 a disintegrin and metalloproteinase domain 15 (metargidin) 0.055

133795 M12529 Hs.169401 apolipoprotein E 0.055

123193 AA489228 Hs.136956 ESTs 0.056

132595 AA253369 Hs.155742 glyoxylate reductase/hydroxypyruvate reductase 0.056

104161 AA456471 Hs.7724 KIAA0963 protein 0.056

115330 AA281145 Hs.88827 ESTs 0.056

112893 T08000 Hs.194684 bassoon (presynaptic cytomatrix protein) 0.056

133475 L29217 Hs.73987 CDC-like kinase 3 0.056

128699 K03207 Hs.103972 proline-rich protein BstNI subfamily 4 0.056

102940 X13956 Hs£4998 Hu 12S RNA induced by poly(rI); poly(rC) and Newcastle disease virus 0.056

131299 AA431464 Hs.25426 ESTs; Weakly similar to unknown [H.sapiens] 0.057

102495 U51240 Hs.79356 Lysosomal-associated multispanning membrane protein-5 0.057

129594 R70379 Hs.115396 Human germline IgD chain gene; C-region; C-delta-1 domain 0.057

118593 N69020 Hs.207689 EST 0.057

126702 U54602 Hs.2785 keratin 17 0.057

124386 N27368 Hs.212414 sema domain; immunoglobulin domain (Ig); short basic domain;

secreted; (semaphorin) 3E 0.057

130538 M20786 Hs.159509 alpha-2-plasmin inhibitor 0.057

114299 Z40782 Hs.22920 similar to S68401 (cattle) glucose induced gene 0.057

115604 AA400378 Hs.49391 ESTs 0.057

106052 AA416947 Hs.6382 ESTs; Highly similar to KIAA0612 protein [H.sapiens] 0.057

131730 U05681 Hs.31210 B-cell CLL/lymphoma 3 0.057

131285 AA479498 Hs.25274 ESTs; Modly smlr to putative seven pass transmembrane prot [H.sapiens] 0.058

129705 X78706 Hs.12068 carnitine acetyltransferase 0.058

123175 AA489010 Hs.178400 ESTs 0.058 

103592 Z30844 Hs.123059 chloride channel Kb 0.058

118196 N59478 Hs.48396 ESTs; Moderately similar to tumor necrosis factor-alpha

-induced protein B12 [H.sapiens] 0.058

104886 AA053348 Hs.144626 growth differentiation factor 11 0.058

104250 AF000575 Hs.105928 leukocyte immunoglobulin-like

and ITIM domains); member 3 0.058

113301 T67452 Hs.13104 EST 0.058

110441 H50302 Hs.19845 ESTs; Highly smlr to prot phosphatase 2A BR gamma subunit [H.sapiens] 0.058

125297 Z39215 Hs.159409 ESTs 0.058

135258 AA292423 Hs.97272 ESTs; Weakly similar to dJ281H8.2 [H.sapiens] 0.058

130633 T92363 Hs.178703 ESTs 0.058 

112006 R42607 Hs^2241 hypothetical protein 0.058 
30805
134907 
32619 
35115 
100531 
24530 
19960 



01076 
30655 
34458 



32878 
21828 
33418 
29317 
30153 
24403 



3)814 
31770 
17557 
03522 
20029 
02135 
123617 
12136 
33725 



106555
123269



29375 
35271 



29364 



01012 
34791 
33700 
23887 
29363 
05719 



17437 

32741 
134437 
107664 
20844 
01574 
131219 
03495 
129607 
106467 
28841 
100515 
19332 
134516 
35012 
03575 
15514 



U12194 
D80002 
AA404565 
N35489 

HG1872-HT1907 

N62256 

W87533 

AA478999 

L04270 

N92934 

AA192614 

AA401452 

AA026793 

AA425166 

U76366 

N46244 

D85815 

N31745 

AA668123 

W20070 



Y10514 

W91960 

U15460 

AA609183 

R46100 

V00563 

U09196 

AA455000 

AA491226 

AA166837 

AA263028 

W79850 

AA397763 

W90398 

AA477106 



AA219179 



10505 
33912 
29581 



K01396 

AA621065 

H05704

AA291644 

H62396 

N27645 

AA394133 

M26041 

AA010594 

AA349417 

M34182 

C00476 

Y09022 

AA404594 

AA450040

T16358 

HG1723-HT1729 

T54095 

AA171939 

X73608 

Z26256 

AA297739 

AA321355 
H55992 
X62744 
M33600 



Hs.170238 
Hs.178292
Hs.53447 
Hs.94653 

Hs.102727 



Hs.1116 

Hs.17409 

Hs.33577

Hs.32060

Hs.58679 

Hs.98497 

Hs.172727 

Hs.110373

Hs.15114 

Hs.102493 

Hs.134170 

Hs.168625 

Hs.31833

Hs.44532

Hs.250640

Hs.41691

Hs.181131

Hs.9739 

Hs.178543

Hs£2520 

Hs.16725 

Hs.105280 

Hs.72620 

Hs.111076 

Hs.11081 

Hs.7562

Hs.6147 

Hs.110757

Hs.112471

Hs.19105 

Hs.697 

Hs.89655 

Hs.75621 

Hs.112943

Hs.110746 



Hs.190266 



Hs.55898

Hs.198253

Hs.5326

Hs.96917

Hs.158029 

Hs.24395 

Hs.153591 

Hs.11607

Hs.154162 

Hs.106443 



Hs.23413
Hs.93029

Hs.55609



Hs.20495
Hs.77522 
Hs.180255 



sodium channel; voltage-gated; type I; beta polypeptide 0.058 

KIAA0160 protein 0.058

ESTs; Moderately similar to kinesin light chain 1 [M.musculus] 0.058

neurochondrin 0.058

Major Histocompatibility Complex, Dg 0.058

EST 0.058

ESTs; Moderately similar to UV-1 protein [H.sapiens] 0.058

KIAA0906 protein 0.058

lymphotoxin beta receptor (TNFR superfamily; member 3) 0.058

cysteine-rich protein 1 (intestinal) 0.058

cysteine and glycine-rich protein 3 (cardiac LIM protein) 0.058

ESTs 0.059

ESTs; Weakly similar to 4F2/CD98 light chain [M.musculus] 0.059

ESTs 0.059

Treacher Collins-Franceschetti syndrome 1 0.059

ESTs 0.059

ras homolog gene family; member D 0.059

ESTs 0.059 

ESTs 0.059 

KIAA0979 protein 0.059

ESTs 0.06

diubiquitin 0.06

H.sapiens mRNA for CD152 protein 0.06

sequence-specific single-stranded-DNA-binding protein 0.06

activating transcription factor B 0.06 

ESTs 0.06 

ESTs 0.061 

immunoglobulin mu 0.061 

Hu 1.1 kb mRNA upregltd in retinoic acid treated HL-60 neutrophilic cells 0.061

ESTs 0.061

ESTs; Weakly similar to dJ963K212 [H.sapiens] 0.061

DKFZP434I114 protein 0.061

malate dehydrogenase 2; NAD (mitochondrial) 0.061

ESTs; Weakly similar to HPBRII-7 protein [H.sapiens] 0.061

ESTs 0.061 

KIAA1075 protein 0.061 

DNA segment on chromosome 21 (unique) 2056 expressed sequence 0.061 

ESTs 0.061 

translocase of inner mitochondrial membrane 17 (yeast) homolog B 0.061

cytochrome c-1 0.062

protein tyrosine phosphatase; receptor type; N 0.062

protease inhibitor 1 (anti-elastase); alpha-1-antitrypsin 0.062

ESTs 0.062

H.sapiens HCR (a-helix coiled-coil rod homologue) mRNA; complete cds 0.062

ESTs 0.062 

ESTs 0.062 
yw5e3.s1 Weizmann Olfactory Epithelium H sapiens cDNA clone

IMAGE:255676 3' smlr to contains L1.t3 L1 repetitive element;, mRNA seq 0.062

ESTs; Highly similar to OASIS protein [M.musculus] 0.062

major histocompatibility complex; class II; DQ alpha 1 0.062

ESTs; Moderately similar to pim-1 protein [H.sapiens] 0.062

ESTs 0.062

protein kinase; cAMP-dependent; catalytic; gamma 0.062

small inducible cytokine subfamily B (Cys-X-Cys); member 14 (BRAK) 0.062 

Not56 (D. melanogaster)-like protein 0.062

ESTs 0.062

ADP-ribosylation factor-like 2 0.062

ESTs 0.062

Macrophage Scavenger Receptor, Alt Splice 2 0.062
ESTs; Weakly similar to !! ALU SUBFAMILY J WARNING ENTRY !! [H.sapiens] 0.062

ESTs 0.062

sparc/osteonectin; cwcv and kazal-like domains proteoglycan (testican) 0.063

H.sapiens isoform 1 gene for L-type calcium channel, exon 1 0.063
ESTs; Weakly similar to ISOLEUCYL-TRNA SYNTHETASE;

CYTOPLASMIC [H.sapiens] 0.063

EST2393 Bone marrow Homo sapiens cDNA 5' end, mRNA sequence 0.063

DKFZP434F011 protein 0.063

major histocompatibility complex; class II; DM alpha 0.063

major histocompatibility complex; class II; DR beta 1 0.063 
R38280 Hs.150922 BCS1 (yeast homolog)-like

AA397825 Hs.5307 synaptopodin

AA410617 Hs.178009 ESTs

D50495 Hs.80598 transcription elongation factor A (SII); 2

D42053 Hs.75890 site-1 protease (subtilisin-like; sterol-regulated; cleaves sterol regulatory

element binding proteins)

D61259 Hs.6529 ESTs

AA521488 Hs.90998 KIAA0128 protein

X74794 Hs.154443 minichromosome maintenance deficient (S. cerevisiae) 4

AA102489 Hs.173484 ESTs

AA070473 zm7c8.s1 Stratagene neuroepithelium (#937231) Homo sapiens cDNA

clone IMAGE'5399 3', mRNA sequence

F10815 Hs.12373 KIAA0422 protein

T57464 Hs.94617 ESTs; Weakly similar to predicted using Genefinder [C.elegans]

AA436856 Hs.98910 ESTs

AA457129 Hs.6455 RuvB (E coli homolog)-like 2

T58607 ya94a02.s1 Stratagene placenta (#937225) Homo sapiens cDNA clone

IMAQE£9290 3*, mRNA sequence. 

AA429290 Hs.17719 ESTs 

Y12661 Hs.171014 VGF nerve growth factor inducible

AA054087 Hs.18858 phospholipase A2; group IVC (cytosolic; calcium-independent)

Y10141 H.sapiens DAT1 gene, partial, VNTR

U40671 Hs.100299 ligase III; DNA; ATP-dependent

AA417821 Hs.237924 ESTs; Highly similar to CGI-69 protein [H.sapiens]

AA457735 Hs£50 IMP (inosine monophosphate) dehydrogenase 1

R23146 Hs.23466 ESTs

H57060 Hs.108268 ESTs

X80198 Hs.77628 steroidogenic acute regulatory protein related

W80730 Hs.28355 ESTs

N93465 Hs.110453 ESTs; Highly similar to CGI-38 protein [H.sapiens]

N74597 Hs.180535 ESTs; Weakly similar to mitogen inducible gene mig-2 [H.sapiens]

AA036794 Hs.95196 ESTs; Weakly similar to T20B12.3 [C.elegans]

T10792 Hs.172098 ESTs

AA406083 Hs.98007 ESTs

T16275 Hs.106359 ESTs

AA456933 Hs.174481 ESTs

AF015910 Homo sapiens unknown protein mRNA, partial cds 

AA282757 HsJ9040 prepronociceptin 

AA480109 Hs.9963 TYRO protein tyrosine kinase binding protein 

R08548 Hs.251651 EST

R53109 Hs.247382 dimethylarginine dimethylaminohydrolase 2

J05037 Hs.76751 serine dehydratase

U80226 Human gamma-aminobutyric acid transaminase mRNA, partial cds

R31652 Hs.621 biglycan

F02322 Hs.26135 ESTs

T12559 Hs.221382 ESTs

AA156597 Hs556441 EST; Moderately similar to CGU36 protein [H.sapiens]

T10316 Hs.4302 ESTs

AA256073 Hs.190626 ESTs

AA278412 Hs.21346 ESTs; Weakly similar to F42C5.7 gene product [C.elegans]

M87789 Hs.140 immunoglobulin gamma 3 (Gm marker)

H03387 Hs.241305 estrogen-responsive B box protein

H93721 Hs.20798 ESTs

AA400138 Hs.97703 ESTs

U12707 Hs.2157 Wiskott-Aldrich syndrome (eczema-thrombocytopenia)

U24183 Hs.75160 phosphofructokinase; muscle

Z39079 Hs.8021 KIAA1058 protein

D51267 HS&148 ribosomal protein S12

T87708 H850098 ESTs

AA096014 Hs.9527 ESTs; Highly similar to HSPC013 [H.sapiens]

Human amiloride-sensitive epithelial sodium channel gamma subunit mRNA,
5' end, partial cds

Hs.77225 ADP-ribosyltransferase (NAD+; poly (ADP-ribose) polymerase)-like 1

AA203321 Hs.151696 DKFZP727G051 protein

D87462 Hs.106674 BRCA1 associated protein-1 (ubiquitin carboxy-terminal hydrolase)

AA262029 Hs.88218 ESTs 

N66046 Hs.141605 ESTs 

N20392 Hs.42846 ESTs 

H83380 Hs.32757 ESTs 
Kruppel-like factor 4 (gut) 0.069

DOM-3 (C.elegans) homolog Z 0.069

CD79B antigen (immunoglobulin-associated beta) 0.069

ESTs 0.069

Proto-Oncogene C-Myc, Alt. Splice 3, Orf 114 0.069
ye20f05.s1 Stratagene lung (#937210) H sapiens cDNA clone IMAGE:
3' similar to contains Alu repetitive element;contains MER12 repetitive element;

mRNA sequence. 0.069

Zinc Finger Protein (Gb^88357) 0.069

ESTs 0.069 

complement component 2 0.069 

interleukin 15 receptor; alpha 0.07

collagen-binding protein 2 (colligen 2) 0.07

plasminogen activator inhibitor; type I 0.07

ESTs; Weakly similar to T25G3.1 [C.elegans] 0.07

ESTs 0.07

CD14 antigen 0.07

ESTs 0.07

ESTs; Weakly similar to ACROSIN PRECURSOR [H.sapiens] 0.07

ESTs 0.07

eukaryotic translation initiation factor 4E binding protein 1 0.07

ESTs 0.07
ESTs; Weakly similar to F55A12.9 [C.elegans] 0.071

mannosidase; alpha; class 2B; member 1 0.071

ESTs; Wkly smlr to !! ALU SUBFAMILY J WARNING ENTRY !! [H.sapiens] 0.071

ESTs 0.071 
collagen; type II; alpha 1 (primary osteoarthritis; spondyloepiphyseal 

dysplasia; congenital) 0.071 

ESTs 0.071 

ESTs 0.071 

ESTs 0.071 

ESTs; Wkly smlr to !! ALU SUBFAMILY SX WARNING ENTRY !! [H.sapiens] 0.071

Accession not listed in Genbank 0.071

dopamine receptor D4 0.071

EST 0.071

Homo sapiens clone 24940 mRNA sequence 0.071

small nuclear ribonucleoprotein polypeptide C 0.072

Mucin 1, Epithelial, Alt. Splice 9 0.072

ESTs 0.072

ribosomal protein L18a 0.072

ESTs 0.072

Homo sapiens clone 24432 mRNA sequence 0.072

ESTs; Weakly similar to WASP-family protein [H.sapiens] 0.072

Human 12*9 transcript of prearranged immunoglobulin V(H)5 pseudogene 0.072

calnexin 0.072

ESTs 0.072

EST 0.072

ceruloplasmin (ferroxidase) 0.072

ESTs; Highly similar to KIAA0476 protein [H.sapiens] 0.072

ESTs 0.072

ESTs 0.072

ESTs 0.072

Homo sapiens mRNA; cDNA DKFZp564C188 (from clone DKFZp564C186) 0.072

ESTs; Weakly similar to hypothetical protein [H.sapiens] 0.072

ESTs 0.073 

otoferlin 0.073 

synuclein; gamma (breast cancer-specific protein 1) 0.073

EST 0.073

ESTs 0.073

KIAA0346 protein 0.073

ESTs 0.073

Mucin (Gb:M22406) 0.073

ESTs 0.073

KIAA0828 protein 0.073

Human alpha-1 collagen type II gene, exons 1, 2 and 3 0.073

ESTs 0.073

DKFZP434B103 protein 0.073

KIAA0211 gene product 0.073

ESTs 0.073
133361


DQ0O7O 

HZBZ79 


nS./ io4o 


134714 


U89922 


ns.oyu 




129805 


T86796 


Ue 1O0Q7K 

MS.ldiio/o 




120421 


A AOOCMCC 

AAZdOlDO 






100885 


U f* A AO fYJJT A OTC 






102789 


U8&759 


nS.lOOddO 




120139 


700070 


Ue 7707ft 

nS.//o/0 




135238 


U76343 


Ue OC07fi 




129618 


N54845 


HS.17dQ3U 




132960 


A AC/V17/IQ 

AAEQ9742 


Ue cicn 
nS.Ol5U 




108751 


AA127063 


Ue 0f\07<7 

hS-ZUo/ 1 / 


134060 


D42039 


Ue 70Q74 

nS.7oo71 




111338 


N79778 


Ue OEAQ/4 




112345 


R5B8B0 


Ue OCCCO 




126456 


W00881 






128937 


Z39939 


Ue -tfrrne 
HS. 10726 




103485 


Y08409






111202 


N66280 


Hs.107922




132625 


a a inn on r\ 

AA429890 


Ue 4 cenco 
HS.looOoo 




103434 


X98085 


Ue C/ZOO 

Hs.54433


102616 


U65581 


Hs.159191




102667 


U70867 


||_ AAM1 

Hs.83974




111422 


R01127 


Hs.19104




101411 


M16938 


Hs.820 




113267 


T65058 


Hs.12725


103559 


Z19585 


Hs.75774 




131588 


AA258613 


Ue OA4 0Q 

H&291o9 




107821


AAQ2U991 


He 179fl^ 
MS. 1/6030 




134278 


H82839 


HSJ1001 




120893 


AA369800 


Hs.57058


108786 


AA128999 






106890 


AA489245 


HsJ8500 




119760 


W72267 


Hs^8219 




132999 


Y00787 


Hs.624 


129156 


AA028195


Hs.108973 




121171 


AA400008 


Hs.161814 




103864 


AA207264 


Hs.181077 




128591 


AA255537 


Hs.102057 




122172 


AA435753 


Hs.161854 


112802 


R97647 


Hs.174855 




107723 


AA015967 


Hs.60680 




113011 


T23737 


Hs.1600 




131279 


AA089853 


Hs^5197 




103190 


X70083 


Hs.158414



I membrane (neutral sphingomyelinase) 0.077

KIAA0255 gene product 0.077

Homo sapiens clone 643 unknown mRNA; complete sequence 0.078

ESTs 0.078

interferon; gamma-inducible protein 30 0.078

ESTs 0.078

KIAA0296 gene product 0.078

ESTs 0.078

ESTs; Moderately similar to KIAA0544 protein [H.sapiens] 0.078

glycine receptor; beta 0.078

ESTs 0.078

even-skipped homeo box 1 (homolog of Drosophila) 0.078

ESTs; Weakly similar to sphingosine kinase [M.musculus] 0.078

ESTs 0.078

ESTs 0.078 

EST 0.078 

EST; Weakly similar to hypothetical protein [H.sapiens] 0.078

ESTs 0.078

ESTs 0.078

ESTs 0.078

ESTs; Highly similar to HYPOTHETICAL PROTEIN KIAA0195 [H.sapiens] 0.078

protein with polyglutamine repeat 0.078

ESTs 0.078

Human clone 23546 mRNA sequence 0.078

lymphotoxin beta (TNF superfamily; member 3) 0.078

ESTs; Weakly similar to predicted using Genefinder [C.elegans] 0.079

ESTs; Weakly similar to chondromodulin-I precursor [H.sapiens] 0.079

Prafoie-Rich Protein Prt^.ADele 0.079 

netrin2(chW(enHto om 
Human DNA from chromosome 19-specific cosmid R30923; genomic sequence 0.079

Human liver GABA transport protein mRNA; 3' end 0.079

ESTs 0.079 

KIAA0521 protein 0.079 

ESTs 0.079

KIAA0081 protein 0.079

extracellular matrix protein 2; female organ and adipocyte specific 0.079

ESTs 0.079
za56d02.r1 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone

IMAGE:296547 5', mRNA sequence. 0.079

ESTs 0.079

thyroid hormone responsive SPOT14 (rat) homolog 0.079

ESTs 0.079

cisplatin resistance associated 0.079

tenascin R (restrictin; janusin) 0.079 

nbosomal protaln LMkB 0.079 

solute carrier family 21 (prostaglandin transporter); member 2 0.079 

ESTs 0.079 

homeo box C6 0.08 
ESTs; Weakly similar to !! ALU SUBFAMILY J WARNING ENTRY !! [H.sapiens] 0.08

thrombospondin 4 0.08

KIAA1021 protein 0.08

ESTs 0.08

ESTs; Weakly similar to DY3£ [C.elegans] 0.08

EST; Highly similar to CMP-N-acetylneuraminic acid hydroxylase [H.sapiens] 0.08
zo8f12.s1 Stratagene neuroepithelium NT2RAMI 937234 Homo sapiens

cDNA clone IMAGE:567119 3', mRNA sequence 0.08

KIAA1066 protein; JSAP1 homolog (mouse); JIP3 homolog (mouse) 0.08

ESTs 0.08

interleukin 8 0.08
dolichyl-phosphate mannosyltransferase polypeptide 2; regulatory subunit

ESTs 0.08

ESTs; Weakly similar to Miller-Dieker lissencephaly gene [H.sapiens] 0.08

ESTs; Weakly similar to O-linked GlcNAc transferase [H.sapiens] 0.08

EST 0.08 

EST 0.08 

EST 0.08 

chaperonin containing TCP1; subunit 5 (epsilon) 0.081

STIP1 homology and U-Box containing protein 1 0.081

filamin C; gamma (actin-binding protein-280) 0.081
103956


AA292411 


Hs.233348 


112706 


R89828 


Hs.138493


126126 


M85370 




130094 


H43286 


Hs.167017


100800 


HG3945-HT4215 




108675 


AA115240


Hs.61816


190490 


AA234259 


He qQft 16 


129666 


M7734.Q 


Hs.118787




(VJ930U/ 


Hs.943 


130536 


T17045 


Hs.159492




AA0161B1 

fVwtu ID 1 


Hs.59752 


123071 


AA482593 




1 liXU/ 


T90457 


He 1Q19Q3 


101250 


L34060 


Hs.79133


122521 


AA449433 


Hs.149227


13391 A 

1309 l*r 


N32811 


Hs.77542 


102038 


UQSB59 


Hs.477


1 1 uooo 


Hirmfl 
FTH/ooo 


Hq 174094 

no. 1 1'HJVH 


1 1000/ 


N70274 




11/ ODD 


NUiftfid 
noiooa 


He 04/119 




H87671 


He 1R999H 

no. 1 OfcQfcU 


lUUOOl 


ri7jwfti 

U/OOOI 


He 19ftf»7ft 


11907A 


T179Q1 
t Heal 


He 1H1174 


1 JgOOZ 


nftO^oo 

l/0O**O£ 


He 


^OOflOO 


70CMQQ 


He 0.701 

n&o/o 1 


110070 

1 10&/£ 


100000 


He 19P/I7 
ns. i£o\it 




AAncoeoo 
MMLTOOOOt 


He 9R774 


111061 


N58054 


Hs. 36859 


19Q9ftQ 

l^atoa 




He 1R3.WI 


102453 


U48437 


Hs.74565 


19ft 9m 


MHJOwOOO 


He 10490ft 
no. io*t too 




nonftftft 


He 4R9fl3 


128656 


AA219552 


Hs.204144


112776 


R95850 


Hs.34494 


1/15404 


AAOftft97*3 

nntOOt/O 


He 9Q9RP. 


1 l/UUU 


Hftd71fl 

no**/ 10 


He 11990ft 


112656 


R65260 


Hs.133151 


128963 


J03890 


Hs.1074 


110857 


U70909 


Hs.39960


lUllQ/ 


l\UO*MU 




1010>tfi 
ltliWO 






lonooo 
10wO££ 


1I0AC47 
W10U04/ 


He 9001 


122743 


AA458674 


Hs.99478 


114569 


AA063316 




<ooo*7n 


1 t7fW71 
U/UOH 


ns.4oouy 


IQolZO 


A AACOOC1 


Lie AT/! IO 
nSA/410 


< moon 


V/MOOC 

A04oZ5 


U e 0C7Q 


1 15305 




Hs.88599 






He 906704 


135017 


AA249586 


Hs.9315 


123776 


AA610071 


Hs.112813 


114454 


AA021091 


Hs.226208 


101246 


L33799 


Hs.202097


107366 


U78310 


Hs.13501 


132779 


T89601 


Hs.95497 


129709 


AA112209


Hs.1209 


115244 


AA278767 


Hs.914 


123253 


AA490878 


Hs.111334 


128469 


T23724 


HS558677 


132220 


AA431847 


Hs.42409 


111664 


R17939 


Hs.22344 


102354 


U38268 




112828 


R98774 


Hs.194338 



ESTs 
ESTs 

EST01684 Fetal brain, Stratagene (cat#936206) Homo sapiens cDNA

clone HFBCH10, mRNA sequence.

gamma-aminobutyric acid (GABA) B receptor; 1

Phospholipid Transfer Protein

ESTs

ESTs

transforming growth factor; beta-induced; 68kD

natural killer cell transcript 4

spastic ataxia of Charlevoix-Saguenay (sacsin)

ESTs

ESTs

ESTs

cadherin 8

ESTs; Weakly similar to PROLINE-RICH PROTEIN MP-3 [M.musculus]
ESTs 

hydroxysteroid (17-beta) dehydrogenase 3 



0.081 
0.081 

0.081 
0.081 
0.081 
0.081 
0.081 
0.081 
0.081 
0.081 
0.081 
0.081
0.081 
0.081 
0.081 
0.081 
0.081 

ESTs; Weakly similar to !! ALU SUBFAMILY J WARNING ENTRY !! [H.sapiens] 0.081
ESTs 0.081
ESTs 0.082
ESTs; Weakly similar to Mouse 195 mRNA; complete cds [M.musculus] 0.082
Human mRNA for ornithine decarboxylase antizyme; ORF 1 and ORF 2 0.082
microtubule-associated protein tau 0.082

KIAA0148 gene product 0.082
Homo sapiens BAC clone RG118D07 from 7q31 0.082
ESTs 0.082
ESTs 0.082

ESTs 0.082
ribosomal protein L18a 0.082
amyloid beta (A4) precursor-like protein 1 0.082
ESTs 0.082
ESTs 0.082
ESTs; Modly smlr to tumor necrosis factor-alpha-induced prot B12 [H.sapiens] 0.082
ESTs 0.082
Homo sapiens mRNA; cDNA DKFZp434P174 (from clone DKFZp434P174) 0.082
ESTs; Weakly similar to repressor protein [H.sapiens] 0.082
transient receptor potential channel 7 0.082
surfactant; pulmonary-associated protein C 0.083 
ESTs 0.083 
Human complement C1q B-chain gene, exon A+1 0.083
ESTs 0.083
thromboxane A synthase 1 (platelet; cytochrome P450; subfamily V) 0.083
EST 0.083
zm2d1 £1 Stratagene corneal stroma (#937222) Homo sapiens cDNA clone
IMAGE:512947 3' similar to TR:E198281 E198281 THIOREDOXIN
REDUCTASE contains Alu repetitive element;, mRNA sequence 0.083
ataxin 2 related protein 0.083
ESTs 0.083
gap junction protein; beta 1; 32kD (connexin 32; Charcot-Marie-Tooth
neuropathy; X-linked) 0.083
ESTs 0.083
ESTs 0.083

ESTs; Weakly similar to NEURONAL OLFACTOMEDIN-RELATED
ER LOCALIZED PROTEIN [H.sapiens] 0.083
ESTs 0.083
ESTs 0.083
procollagen C-endopeptidase enhancer 0.083
pescadillo (zebrafish) homolog 1; containing BRCT domain 0.083
ESTs; Weakly similar to GLUCOSE TRANSPORTER TYPE 5;
SMALL INTESTINE [H.sapiens] 0.083
acyl-Coenzyme A dehydrogenase; long chain 0.083
Human mRNA for SB classII histocompatibility antigen alpha-chain 0.083
ferritin; light polypeptide 0.083
EST 0.083
ESTs; Highly similar to CGM46 protein [H.sapiens] 0.083
ESTs 0.083
Human cytochrome b pseudogene, partial cds 0.084 
ESTs 0.084 
110410


H47868


Hs.34024


102620 


U66052 




102550 


U58087 


Hs.14541 


108417 


AA075716 




113299 


T67285 


Hs.13089 


117669 


N49947 


Hs.46990 


113734 


T98484 


Hs.18377


133325 


C00424 


Hs.7101 


123368 


AA505022 


Hs.124838 


101615 


M55153 


Hs.8265 


119352 


T65972 


Hs.193365


123828 


AA620686 


Hs.112884


1(0611 


238133 


Hs.113973


131289 


AA485697 


Hs.25334


128678 


T15896 


Hs.103535 


130814 


AA256695 


Hs.19813 


133391 


X57579 


Hs.727 


129322 


AA437153 


Hs.110407


109284 


AA196995 


Hs.86092 


116689 

1 IOW3 


F09222 


Hs.66099 


100545 


HG2147-HT2217 




102634 


U66711 


Hs.77667 


111735 


R25389 


Hs.23856


105181


AA190878


Hs.10974




AA455350 


Hs.99401


114543 


AA056121 


Hs.158419 


133597 


AA425908 


Hs.75139 


121064 


AA398647 


Hs.97406 


122331 

1 1 1. CO 1 


AA436369 


Hs.197728


imam 


D50550 


Hs.95659


101727 


M73481 


Hs.73883 


131226 


AA165400


Hs.24476 


IOOOQU 


AA095041 


Hs.181073


102792 


U87964 


Hs£27576 


104976 


AA086460 


Hs.183669 


120865 


AA350631 


Hs^6963 


infinan 


AA418046 


Hs.35124


128571 


AA416619 


Hs.101661 


101838 


M92934 


Hs.75511 


128514 


H84261 


Hs.100843 




WVhmmI 


Hs.79 








11fiQfi7 
1 IOJJO/ 


HfifBSfi 
nouooo 


Hs.40124 


110053 


H12586


Hs.89563


i i*wyo 


nnUU/OlO 


Hs.110155


wmoo 


V¥«**KKJ 1 


Hs.251385






Hs.75323 


112544 


R70948 


Hs.29153


111423 


R01165 


Hs.188507 


127918 


AA806043 


Hs.115396


107300 


T40348 


Hs.90488


134847 


R51194 




124579 


N68345 


Hs.127179 


130471 


Z68280 


Hs.183706 


116596 


D60755 


Hs.92955 


105069 


AA136345 


Hs.23617


102491 


U51010 




130069 


AA055896 


Hs.146428 


130234 


AA280413 


Hs.157441 


120540 


AA262992 


Hs.96417 


122508 


AA449221 


Hs.20432



ESTs 

Human clone W2-6 mRNA from chromosome X
cullin 1

zm89e5.s1 Stratagene ovarian cancer (#937219) H sapiens cDNA clone

IMAGE54512 3' similar to gb:X14723 CLUSTERIN PRECURSOR 

(HUMAN);, mRNA sequence 

ESTs 

ESTs 

EST 

periodontal ligament fibroblast protein 
ESTs 

transglutaminase 2 (C polypeptide; protein-glutamine

-gamma-glutamyl transferase)

ESTs; Moderately similar to alternatively spliced product

using exon 13A [H.sapiens]

EST

myosin; heavy polypeptide 8; skeletal muscle; perinatal
ESTs; Weakly similar to ION CHANNEL HOMOLOG RIC
PRECURSOR [M.musculus]
ESTs
ESTs

inhibin; beta A (activin A; activin AB alpha polypeptide)

ESTs; Weakly similar to coded for by C. elegans cDNA yk173c12.5 [C.elegans]

ESTs 

ESTs 

Mucin 3, Intestinal (Gb:M55405) 

lymphocyte antigen 6 complex; locus E 

ESTs; Weakly similar to FAST kinase [H.sapiens]

ESTs; Moderately similar to unknown [R.norvegicus]

EST

ESTs

partner of RAC1 (arfaptin 2)
ESTs

ESTs; Weakly similar to ZINC FINGER PROTEIN 135 [H.sapiens]

lethal giant larvae (Drosophila) homolog 1

gastrin-releasing peptide receptor 

ESTs 

ESTs 

GTP binding protein 1 

ESTs; Weakly similar to !! ALU SUBFAMILY J WARNING ENTRY !! [H.sapiens]

EST 

ESTs 

ESTs 

connective tissue growth factor 

ESTs; Weakly similar to similar to GTP-binding protein [C.elegans]
aminoacylase 1

Rab geranylgeranyltransferase; alpha subunit
EST

nuclear cap binding protein 1; 80kD
ESTs

murine retrovirus integration site 1 homolog



ESTs 
ESTs 

Human germline IgD chain gene; C-region; C-delta-1 domain
ESTs

yj71a08.r1 Soares breast 2NbHBst Homo sapiens cDNA clone IMAGE:154166

5' similar to gb:L11284 DUAL SPECIFICITY MITOGEN-ACTIVATED PROTEIN

KINASE KINASE 1 (HUMAN);, mRNA sequence.

ESTs; Weakly similar to TERATOCARCINOMA-DERIVED GROWTH

FACTOR 1 [H.sapiens]

adducin 1 (alpha)

ESTs

ESTs; Weakly similar to ZFOC1 gene product [H.sapiens]

Human nicotinamide N-methyltransferase gene, exon 1 and 5' flanking region

collagen; type V; alpha 1

spleen focus forming virus (SFFV) proviral integration oncogene spi1

ESTs 

ESTs 



0.084
0.084 
0.084 



0.084 
0.084 
0.084 
0.084 
0.084 
0.084 

0.084 

0.084 
0.084 
0.084 

0.084 
0.084 
0.084 
0.084 
0.084 
0.084 
0.085 
0.085 
0.085 
0.085 
0.085 
0.085 
0.085 
0.085 
0.085 
0.085 
0.085 
0.085 
0.085 
0.085 
0.085 
0.085
0.085 
0.085 
0.085 
0.085 
0.085 
0.085 
0.085 
0.085 
0.085 
0.085
0.085
0.085
0.086 
0.086 
0.086
0.086 



0.086 

0.086 
0.086 
0.086 
0.086 
0.086 
0.086
0.086
0.086
0.086 
128054 AI205718 Hs.125416 ESTs 0.086

133020 AA053248 Hs.185182 ESTs; Highly similar to 40S RIBOSOMAL PROTEIN S10 [H.sapiens] 0.086

130056 AA017356 Hs.171900 armadillo repeat gene deleted in velocardiofacial syndrome 0.086

130504 U48865 Hs.158323 CCAAT/enhancer binding protein (C/EBP); epsilon 0.086

133978 W73859 Hs.78061 transcription factor 21 0.086

105265 AA227941 Hs.26088 ESTs 0.086

133035 T15965 Hs.6333 ESTs 0.086

100768 HG3636-HT3846 Myosin, Heavy Polypeptide 9, Non-Muscle 0.086

129338 T56800 Hs.47274 Homo sapiens mRNA; cDNA DKFZp564B176 (from clone DKFZp564B176) 0.086 

132789 W23761 Hs.56876 ESTs 0.086

116099 AA456309 Hs.58831 regulator of Fas-induced apoptosis 0.086

100721 HG3355-HT3532 Peroxisome Proliferator Activated Receptor (Gb:Z30972) 0.087

112569 R73150 Hs.75270 GTP-binding protein homologous to Saccharomyces cerevisiae SEC4 0.087

130645 AA020942 Hs.17200 STAM-like protein containing SH3 and ITAM domains 2 0.087

100751 HG3527-HT3721 Luteinizing Hormone, Beta Subunit 0.087

134550 M27161 Hs.85258 CD8 antigen; alpha polypeptide (p32) 0.087

130885 AA338646 Hs£0912 adenomatous polyposis coli like 0.087

101446 M21302 Hs.56306 small proline-rich protein 2A 0.087

116287 AA487856 Hs.155829 KIAA0676 protein 0.087

134034 X89267 Hs.78601 uroporphyrinogen decarboxylase 0.087

130860 U66061 Hs.241395 protease; serine; 1 (trypsin 1) 0.087

109901 H04992 Hs.30499 ESTs 0.087

107537 Z20777 Hs.9857 ESTs; Weakly similar to peroxisomal short-chain alcohol

dehydrogenase [H.sapiens] 0.087

133232 AA496030 Hs.6845 ESTs 0.087

108559 AA085161 zn12c5.s1 Stratagene hNT neuron (#937233) H sapiens cDNA clone

IMAGE54728 3' similar to TR:G1151228 G1151228 LPG1P. ;, mRNA seq 0.087

121288 AA401735 HsJ7340 EST 0.087

108844 AA132916 Hs.177961 Human Chromosome 16 BAC clone CIT987SK-A-388D4 0.087

129874 AA406488 Hs.181551 ESTs 0.087

105139 AA164543 Hs.110082 ESTs 0.088

124789 R43803 Hs.78110 ESTs; Weakly similar to F17A9.2 [C.elegans] 0.088

115923 AA441929 Hs.38205 ESTs 0.088

123640 AA609292 Hs.112681 ESTs 0.088

131607 AA351409 Hs.172740 microtubule-associated protein; RP/EB family; member 3 0.088

130064 T67053 Hs.181125 immunoglobulin lambda gene cluster 0.088

108752 AA127070 Hs.71055 ESTs 0.088

124249 H68077 Hs.108211 ESTs 0.088 

100109 AJ000480 Hs.143513 phosphoprotein regulated by mitogenic pathways 0.088 

104642 AA004662 Hs.184245 KIAA0929 protein Msx2 interacting nuclear target (MINT) homolog 0.088

131752 AA453311 Hs£1568 ESTs 0.088

114727 AA132545 Hs.190202 ESTs 0.088

120965 AA398089 Hs.179715 ESTs 0.088

100396 D84361 Hs.151123 Human mRNA for p52 and p64 isoforms of N-Shc; complete cds 0.088

106218 AA428451 Hs.91146 DKFZP586E0820 protein 0.088

111562 R09567 Hs.187569 ESTs 0.088

121219 AA400606 Hs.144344 EST 0.088

101187 L20316 Hs.208 glucagon receptor 0.088

101513 M28210 Hs.27744 RAB3A; member RAS oncogene family 0.088

116454 AA621071 Hs.42034 ESTs; Moderately similar to T-complex protein 10A [H.sapiens] 0.088

116171 AA463434 Hs.42658 ESTs 0.089 

117500 N31909 Hs.44278 ESTs 0.089 

119978 W88623 Hs.59190 EST 0.089

132005 D58231 Hs.173091 DKFZP434K151 protein 0.089

109914 H05529 Hs.194704 leucine-rich; glioma inactivated 1 0.089

130370 M55265 Hs.155140 casein kinase 2; alpha 1 polypeptide 0.089

104262 AF009801 Hs.105941 bagpipe homeobox (Drosophila) homolog 1 0.089

129708 AA417181 Hs.120858 ESTs 0.089 

106398 AA447545 Hs.18268 adenylate kinase 5 0.089 

120884 AA365356 Hs.97041 ESTs 0.089

130404 X72012 Hs.76753 endoglin (Osler-Rendu-Weber syndrome 1) 0.089

114072 Z38184 Hs.123633 ESTs 0.089

131470 X54938 Hs.2722 inositol 1;4;5-trisphosphate 3-kinase A 0.089

124573 N67935 Hs.194703 adaptor-related protein complex 4; mu 1 sub-unit 0.089

114717 AA131240 Hs.252014 EST 0.089

133806 M12759 Hs.76325 Human Ig J chain gene 0.09

130470 AA398552 Hs.15711 KIAA0639 protein 0.09

133182 Z80787 Hs.240135 H4 histone family; member J 0.09

116036 AA452572 Hs.43866 ESTs 0.09 
132404


AA393903 


Hs.4768 




122695 


AA456048 


Hs.99403




125975 


AA495891 


Hs.152290 




110783 


N23669 


Hs.26407




129860 


AM10343 


Hs.129826 




120740 


AA302650 


HSX6654 




119564 


W38206 






134474 


AA054746 


HsX379 




119014 


N95435 


Hs£5144 


109791 


F10669 


Hs.13228 




117605 


N35073 


Hs.44433 




121589 


AA416627 


Hs.191598




104326 


D81655 


Hs.143067 




129861 


N69507 


Hs.129849 


102795 


U88667 


Hs.198396 




119626 


W49499 


Hs.184456 




110516 


H56894 


Hs^7368 




105382 


AA236853 


Hs.111801




123754 


AA609964 


Hs.102021 


108008 


AA039430 


Hs.61920




121057 


AA398619 


Hs.142375 




123675 


AA609474 


Hs.112713




135194 


C20975 


Hs.9613




127070 


AA641812 


Hs.190037 


134051 


S67070 


Hs.78846 




133382 


Ml 12532 


Hs.7247 




103615 


Z46967 


Hs.115460




118457 


N66593 


Hs.49230 




118504 


N67334 


Hs.50158


112915 


T10176 


Hs.4254 




132088 


AA470121 


Hs.243960




101504 


M27288 


Hs.248156




112550 


R71391 


Hs.29074




128551 


H09058 


Hs.237323


112879 


T03541 


Hs.115960




127079 


AI364691


Hs.128628 




101993 


U01062 


Hs.77515 




113020 


T23830 


Hs.7303 




120465 


AA251505 


Hs.130861 


130152 


U52645


Hs.151139 




104941 


AA065169 


Hs.17805 




110090 


H16076 


Hs.6915 




135375 


AA480688 


Hs.99741




123799 


AA620418 


Hs.112861


118966 


N93438 


Hs.76907 




116969 


H80633 


Hs.143038




125147 


W38150 






100836 


HG4113-HT4383 






114726 


AA132509 


Hs.103827 


107311 


157738 


Hs.174112 




112863 


T03148 


Hs.4610 




129290 


AA521407 


Hs.110095




103384 


X92762 


Hs.78021 




112508 


R68213 


Hs.28847




111863 


R37495 


Hs^3578 




131184 


AA452705 


Hs.23954




107420 


W26567 


Hs.4775 




111768 


R27606 


Hs.24185 


112290 


R53940 


Hs£6016 




130581 


AA481982 


Hs.16258 




120744 


AA302772 


HsJ>28649 




112226 


R50761 


Hs£5738 




116154 


AA460951 


Hs.57100 


102640 


U67674 


Hs.194783 




129797 


X53595 


Hs.1252 




102705 


U77180 


HSJ00Q2 




132408 


AA035547 


Hs.47822 




108441 


AA079079 





ESTs 0.09
ESTs; Moderately similar to undulin 2 [H.sapiens] 0.09
ESTs; Highly similar to PACAP type-3/VIP type-2 receptor [H.sapiens] 0.09
ESTs 0.09
tetraspan transmembrane 4 super family 0.09
EST 0.09
Accession not listed in Genbank 0.09
ESTs 0.09
ESTs 0.09
DRE-antagonist modulator; calsenilin 0.09
ESTs 0.09
ESTs 0.09
ESTs 0.09
DKFZP564M182 protein 0.09
ATP-binding cassette; sub-family A (ABC1); member 4 0.09
ESTs; Wkly smlr to !! ALU SUBFAMILY SX WARNING ENTRY !! [H.sapiens] 0.09
EST 0.09
Homo sapiens mRNA; cDNA DKFZp564H2023 (from clone DKFZp564H2023) 0.09
ESTs 0.09
ESTs 0.09
ESTs; Moderately similar to putative envelope protein [H.sapiens] 0.091
EST 0.091
ESTs; Highly similar to angiopoietin-related protein [H.sapiens] 0.091
ESTs 0.091
heat shock 27kD protein 2 0.091
ESTs 0.091
calicin 0.091
EST 0.091
ESTs 0.091
ESTs 0.091
HLA-B associated transcript-3 0.091
oncostatin M 0.091
ESTs 0.091
N-acetylglucosamine-phosphate mutase; DKFZP434B187 protein 0.091
ESTs 0.091
ESTs; Moderately similar to CL3BC [R.norvegicus] 0.091
inositol 1;4;5-triphosphate receptor; type 3 0.091
ESTs; Weakly similar to PROHIBITIN [H.sapiens] 0.091
ESTs 0.091
E74-like factor 4 (ets domain transcription factor) 0.091
ESTs 0.091
ESTs 0.091
ESTs; Weakly similar to BRAIN PROTEIN H5 [H.sapiens] 0.091
ESTs 0.092
ESTs; Highly similar to HSPC002 [H.sapiens] 0.092
ESTs 0.092
Accession not listed in Genbank 0.092
Olfactory Receptor Or17-201 0.092
EST 0.092
ESTs 0.092
EST 0.092
ESTs 0.092
tafazzin (cardiomyopathy; dilated 3A (X-linked); endocardial fibroelastosis 2; Barth syndrome) 0.092
ESTs 0.092
ESTs 0.092
ESTs; Weakly similar to KIAA0584 protein [H.sapiens] 0.092
ESTs 0.092
ESTs 0.092
ESTs 0.092
ESTs; Weakly similar to RAS-RELATED PROTEIN RAB-5A [H.sapiens] 0.092
EST 0.093
ESTs 0.093
ESTs 0.093
solute carrier family 10 (sodium/bile acid cotransporter family); member 2 0.093
apolipoprotein H (beta-2-glycoprotein I) 0.093
small inducible cytokine subfamily A (Cys-Cys); member 19 0.093
KIAA0380 gene product; RhoA-specific guanine nucleotide exchange factor 0.093
zm97c9.s1 Stratagene colon HT29 (#937221) Homo sapiens cDNA clone IMAGE:545872 3' similar to contains element MER22 MER22 repetitive element;, mRNA sequence 0.093

108145 AA054133 Hs.63085 ESTs 0.093
106466 AA449990 Hs.76057 lysophospholipase II 0.093
101697 M64358 Human rhom-3 gene, exon 0.093
121294 AA401958 Hs.240170 ESTs; Moderately similar to alternatively spliced product using exon 13A [H.sapiens] 0.093
117824 N49065 Hs.125201 ESTs; Weakly similar to B7 [M.musculus] 0.093
115771 AA422049 Hs.40780 ESTs 0.093
102303 U33053 Hs.2499 protein kinase C-like 1 0.093
131405 U79255 Hs.26468 amyloid beta (A4) precursor protein-binding; family A; member 2 (X11-like) 0.093
112909 T10069 Hs.101094 ESTs 0.093
124173 H41281 Hs.107619 ESTs 0.093
112488 Hs.28788 ESTs 0.093
130554 Hs.159637 valyl-tRNA synthetase 2 0.093
106413 AA447954 Hs.6311 ESTs 0.093
111711 R22891 Hs.7093 ESTs 0.094
117595 N34933 Hs.44654 EST 0.094
113813 W45174 Hs.31382 ESTs 0.094
107769 AA018449 Hs.125220 Homo sapiens DNA from chromosome 19-cosmids R30102-R29350-R27740 containing MEF2B; genomic sequence 0.094
114986 AA250743 HsJ2198 ESTs; Highly similar to calcium-regulated heat stable protein CRHSP-24 [H.sapiens] 0.094
130297 H94949 Hs.171955 trophinin-assisting protein (tastin) 0.094
109589 F02429 Hs.6581 ESTs 0.094
112592 R77631 Hs.29126 ESTs 0.094
102314 U34038 Hs.154299 coagulation factor II (thrombin) receptor-like 1 0.094
116128 AA459915 Hs.112193 mutS (E. coli) homolog 5 0.094
106809 AA479704 Hs.220324 Human DNA sequence from clone 283E3 on chromosome 1p36.21-36.33. Contains the alternatively spliced gene for Matrix Metalloproteinase in the Female Reproductive tract MIFR1; -2; MMP21/22A; -B and -C; a novel gene; the alternatively spliced CDC2L2 gene for 0.094


130607 AA043894 Hs.16603 ESTs 0.094
120592 AA281929 Hs.143974 ESTs 0.094
117230 N20535 Hs.43265 melastatin 1 0.094
105948 AA404597 HsJ133 ESTs 0.094
101333 147738 Hs.80313 p53 inducible protein 0.094
101909 S69265 Homo sapiens mRNA for PLE21 protein; complete cds 0.094
106959 AA497031 Hs.8657 ESTs; Highly similar to CTG7a [H.sapiens] 0.094
127034 AA352389 ESTs; Wkly smlr to glucose-6-phosphatase catalytic subunit [R.norvegicus] 0.095
134430 H52105 Hs.8309 KIAA0747 protein 0.095
120342 AA207105 Hs.45066 Homo sapiens mRNA; cDNA DKFZp434I143 (from clone DKFZp434I143) 0.095
104450 177564 Hs.103978 serine/threonine kinase 22B (spermiogenesis associated) 0.095
130902 AA424530 Hs.21061 ESTs 0.095
102708 U77594 Hs.37682 retinoic acid receptor responder (tazarotene induced) 2 0.095
107373 U85773 Hs.154695 phosphomannomutase 2 0.095
123569 AA608952 Hs.195292 ESTs; Weakly similar to RNA helicase HDB/DICE1 [H.sapiens] 0.095
102687 U73379 Hs.93002 ubiquitin carrier protein E2-C 0.095
128888 AA034851 Hs.106893 ESTs 0.095
100283 D43642 Hs.2430 transcription factor-like 1 0.095
102747 U79303 Hs.82482 protein predicted by clone 23882 0.095
107798 AA019346 Hs.6091B EST 0.095
123565 AA608907 Hs.112614 EST 0.095
116010 AA449450 Hs.56421 ESTs; Weakly similar to Similarity to H.influenza ribonuclease PH [C.elegans] 0.095
117155 H97536 Hs.42391 EST 0.095
133094 AA115572 Hs.64746 chloride intracellular channel 3 0.095
113174 T54659 Hs.9779 ESTs 0.095
102016 U03270 Hs.122511 centrin; EF-hand protein; 1 0.095
130126 AB002318 Hs.150443 KIAA0320 protein 0.095
134813 X14767 Hs.89768 gamma-aminobutyric acid (GABA) A receptor; beta 1 0.095
132055 N69440 Hs.38132 ESTs 0.095
122229 AA436198 Hs.103902 ESTs 0.096
127574 AA907314 Hs.188905 ESTs 0.096
134432 AA053022 Hs.8312 ESTs 0.096
128052 AA878398 Hs.190491 ESTs 0.096
101637 M58285 Hs.132834 hematopoietic protein 1 0.096
103386 X92972 Hs.80324 protein phosphatase 6; catalytic subunit 0.096
133079 AA477561 Hs.6449 ESTs 0.096
120328 AA196979 Hs.104129 ESTs; Weakly similar to protease [H.sapiens] 0.096
107640 AA009615 Hs.257808 ESTs 0.096

123389 AA521176 Hs.221231 ESTs 0.096
103222 X74795 Hs.77171 minichromosome maintenance deficient (S. cerevisiae) 5 (cell division cycle 46) 0.096
111704 R22450 Hs.23396 ESTs; Highly similar to ZINC FINGER PROTEIN 140 [H.sapiens] 0.096
126656 AA306523 EST177475 Jurkat T-cells VI Homo sapiens cDNA 5' end, mRNA sequence. 0.733
127071 AA250806 ESTs 0.096
114550 AA056755 Hs.151714 ESTs 0.096
125955 AI356943 Hs.143761 ESTs 0.096
134363 M37033 HsX2212 CD53 antigen 0.096
128550 W76492 Hs.170142 ESTs 0.096
122598 AA453465 HsX9329 ESTs 0.096
118898 N90703 Hs.4236 KIAA0478 gene product 0.096
117661 mm Hs.44940 ESTs 0.096
120996 AA398281 Hs.143684 ESTs 0.096
123388 AA521172 Hs.134417 ESTs 0.096
106700 AA463929 Hs^8701 ESTs 0.096
112962 T16814 Hs.6828 ESTs 0.096
121262 AA401372 Hs.97723 ESTs 0.096
134551 R44839 Hs.8526 i-beta-1;3-N-acetylglucosaminyltransferase 0.096
112060 R43754 Hs.21164 ESTs 0.096
134678 AA033935 Hs.182595 dynein; axonemal; light polypeptide 4 0.096
100855 HG4234-HT4504 Methylenetetrahydrofolate Reductase 0.097
132414 N91193 Hs.48145 ESTs 0.097
112900 T08758 Hs.3813 ESTs 0.097
115989 AA447777 Hs.93135 ESTs 0.097
103561 221488 Hs.143434 contactin 1 0.097
131087 AA009738 Hs^2824 ESTs; Weakly similar to p160 myb-binding protein [M.musculus] 0.097
120293 AA190859 Hs.191428 ESTs 0.097
111830 R36081 Hs.25085 EST 0.097
113654 T95770 Hs.17666 ESTs 0.097
132675 AA179338 Hs.5476 serine proteinase inhibitor 0.097
120182 Z40125 Hs.91968 ESTs 0.097
132879 U16282 HsJ881 ELL gene (11-19 lysine-rich leukemia gene) 0.097
134211 AA056681 HsX0021 ESTs; Weakly similar to 6209.p [D.melanogaster] 0.097
115448 AA284845 Hs.165051 ESTs 0.097
118118 N56901 Hs.47995 ESTs 0.097
107598 AA004528 Hs.169444 ESTs 0.097
128933 H01824 Hs.760 GATA-binding protein 2 0.097
114892 AA235988 H&B6Q24 ESTs 0.097
101922 S75168 Hs.274 megakaryocyte-associated tyrosine kinase 0.097
105444 AA252374 Hs.19333 ESTs; Weakly similar to ATP(GTP)-binding protein [H.sapiens] 0.097
128155 AA926843 Hs.143302 ESTs 0.097
116276 AA485870 Hs.44914 ESTs 0.097
111964 R41227 Hs.21860 ESTs 0.097
135100 AA398926 Hs.251108 Homo sapiens mRNA; chromosome 1 specific transcript KIAA0493 0.097
124872 R69251 Hs.101506 EST 0.097
103084 X59932 Hs.77793 c-src tyrosine kinase 0.097
124138 H23199 Hs.107010 ESTs 0.098
130048 R31745 Hs.211612 SEC24 (S. cerevisiae) related gene family; member A 0.098
100208 D26129 Hs.78224 ribonuclease; RNase A family; 1 (pancreatic) 0.098
123537 AA608775 Hs.112589 ESTs 0.098
116999 N95019 Hs.55092 ESTs 0.098
118847 W80384 Hs.9853 ESTs 0.098
112819 R98618 Hs.35984 ESTs 0.098
131080 J05008 Hs.2271 endothelin 1 0.098
127353 AA190853 Hs.155360 ESTs 0.098
132068 X66365 Hs.38481 cyclin-dependent kinase 6 0.098
105744 AA293436 Hs.12909 ESTs 0.098
133680 M92357 Hs.101382 tumor necrosis factor; alpha-induced protein 2 0.098
122899 AA469960 Hs.178420 ESTs; Highly similar to WASP interacting protein [H.sapiens] 0.098
128700 U59286 Hs.103982 small inducible cytokine subfamily B (Cys-X-Cys); member 11 0.098
104393 H46486 Hs.226499 nesca protein 0.098
123320 AA406792 Hs.139572 EST 0.098
129169 N31641 Hs.109058 ribosomal protein S6 kinase; 90kD; polypeptide 5 0.098
135093 U51333 Hs.159237 hexokinase 3 (white cell) 0.098
113269 T65159 Hs.85044 ESTs 0.098
124283 H86783 Hs.194136 ESTs; Moderately similar to zinc finger protein RIN ZF [R.norvegicus] 0.098
114376 GMCSF Accession not listed in Genbank 0.099
100881 HG4458-HT4727 Immunoglobulin Heavy Chain, Vdjc Regions (Gb:L23563) 0.099
116572 D45654 Hs.65582 DKFZP586C1324 protein 0.099


123956 AA621747 Hs.112847 EST 0.099
100818 HG4018-HT4288 Opioid-Binding Cell Adhesion Molecule 0.099
132754 W47419 Hs.56007 Human DNA from chromosome 19-specific cosmid F25965; genomic sequence 0.099
112741 R93080 Hs.35035 ESTs 0.099
112748 R93299 Hs.166492 ESTs 0.099
130858 S57235 Hs.246381 CD68 antigen 0.099
124870 R69233 Hs.101504 ESTs 0.099
125304 239833 Hs.124940 GTP-binding protein 0.099
121287 AA401995 Hs.97860 ESTs 0.099
128602 AA046103 Hs.102367 ESTs 0.099
124062 H00440 Hs.144524 ESTs; Weakly similar to signal transducer and activator of transcription 2 [M.musculus] 0.099
100547 HG2149-HT2219 Mucin (Gb:M57417) 0.099
105652 AA282505 Hs.19015 ESTs 0.099
133390 AA459945 Hs.72660 KIAA0585 protein 0.099
133503 M33195 Hs.743 Fc fragment of IgE; high affinity I; receptor for; gamma polypeptide 0.099
109461 AA232667 Hs.58210 ESTs 0.099
102068 U09117 Hs£0776 phospholipase C; delta 1 0.099
113464 T86931 Hs.16295 ESTs 0.099
104240 AB002368 Hs.70500 KIAA0370 protein 0.099
121113 AA399109 Hs.161813 ESTs 0.1
122896 AA469952 Hs.97899 ESTs; Weakly similar to daE; len:343; CAI: 0.17; ALC_YEAST P25335 ALLANTOICASE [S.cerevisiae] 0.1
102405 U43146 Hs.159526 patched (Drosophila) homolog 0.1
103599 Z33905 HSJ81218 receptor-associated protein of the synapse; 43kD 0.1
121079 AA398719 Hs.14169 ESTs; Weakly similar to CREB-binding protein [H.sapiens] 0.1
115820 AA427487 Hs.39619 ESTs; Weakly similar to RETICULOCALBIN 1 PRECURSOR [H.sapiens] 0.781
125106 T95766 Hs.189760 ESTs 0.1
131373 N68116 Hs.56146 Down syndrome critical region gene 3 0.1
120224 Z41239 Hs.106960 ESTs 0.1
133090 AA448228 Hs.6468 ESTs 0.1
132300 AA133244 Hs.44234 ESTs 0.1
113129 T49384 Hs*988 EST 0.1
110638 H73197 Hs.17241 ESTs 0.1
131364 R53255 Hs.26010 ESTs 0.1
105370 AA236476 Hs.22791 ESTs; Weakly similar to transmembrane protein with EGF-like and two follistatin-like domains 1 [H.sapiens] 0.238

TABLE 11A shows the accession numbers for those primekeys lacking UnigeneID's for
Table 11. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from
Genbank ESTs and mRNAs. These sequences were clustered based on sequence similarity
using Clustering and Alignment Tools (DoubleTwist, Oakland, California). The Genbank
accession numbers for sequences comprising each cluster are listed in the "Accession"
column.
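
The Pkey, gene cluster number, and accession layout described above can be sketched as a small lookup structure. This is only an illustrative sketch and not part of the disclosed method: the row format (whitespace-separated Pkey, CAT number, then accessions), the function names `parse_cluster_rows` and `accession_index`, and the sample identifiers are all assumptions made for the example.

```python
def parse_cluster_rows(lines):
    """Map each Pkey to its CAT (gene cluster) number and accession list.

    Assumes (hypothetically) one row per line: Pkey, CAT number, then
    zero or more Genbank accession numbers, separated by whitespace.
    """
    table = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or incomplete rows
        pkey, cat, *accessions = fields
        table[pkey] = {"cat": cat, "accessions": accessions}
    return table


def accession_index(table):
    """Invert the table: accession -> sorted list of Pkeys whose cluster lists it."""
    index = {}
    for pkey, entry in table.items():
        for acc in entry["accessions"]:
            index.setdefault(acc, []).append(pkey)
    return {acc: sorted(pkeys) for acc, pkeys in index.items()}


# Placeholder rows; these identifiers are invented for illustration only.
example_rows = [
    "900001 11111.1 XX000001 XX000002",
    "900002 22222.3 XX000002 XX000003",
]
clusters = parse_cluster_rows(example_rows)
by_accession = accession_index(clusters)
```

With such a structure, a probeset that lacks a UnigeneID can still be traced to the sequences in its cluster through `clusters[pkey]["accessions"]`, and `by_accession` answers the reverse question of which probesets a given accession supports.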



Pkey: Unique Eos probeset identifier number
CAT number: Gene cluster number
Accession: Genbank accession numbers

Pkey CAT number Accession
100610 19864J 



100674 21517.2 



108559 41469J 

100721 19818J 

100748 41861J 

100750 15759J 



100751 24700J






100760 1334.7 
100775 18179.3 



AW161357 AI879062 AI928938 AW161097 AW161 167 BE314465 AA351715 F0709a AA178034 F08510 F0Q653 AI936671 
AM76718 AW772454 AI807703 R44253 AA976667 AE9B5186 AJ650254 K38942 R84829 AA018724 AA001000 H85934 
AA019126 H85609 AA017000 AA339355 AW950556 051397 AA213981 BE548002 AI056359 AA001560 AW9521 13 
AA317769 A1857477 AI857475 AW249771 AW162661 H38943 AA018628 R85885 AI984613 AI934765 AT796172 AW157488 
A1929191 R85523 051221 D53851 H85610 AI749674 F215B2 AA323145 AA019127 AA687444 T06745 AI699293 H29532 
AA214Q29 AA223656 NM_0 16834 X 14474 R19697 H09695 R17455 R13812 R19056 AI681231 AJ590200 R37671 AA861828 
AI890023 AI935669 AW005821 AA324581 H17335 R37659 R42802 R46242 RB0936 R59731 H28993 AA479907 R44570 
AI890696 AA308884 AA507078 R41274 AI365507T16348 A1560453 F03259 F04722 T16312 AA016081 AW073061 
BE314824 W28930 R44098 R51045 

AW403342 AW248986 BE561709 AA357312 BE311834 BE389496 BE294887 AW732696 BE047668 AT702383 BE019155 
AI702367 BE408966 BE280458 BE313759 BE513492 BE535404 BE28Q258 AC005263 NM.007165 L21990 AW732711 
AI564920 AW249094 BE265365 AW607186 AW607346 BE005217 H2721 1 U46230 BE260066 BE207043 BE546782 
AW248659 
AA085228AA085161 

L40904 NM.005037 X90563 AB005526 H2159B AA088517 



BE157260 BE157265 R48118 H43827 Z17877 AW379070 AW291778 M20605 J03253 M14206 V00568 AI860465 AW296022
M13930 AL047400 J00120 BE018476 AW675223 T26980 F06694 R22709 R24720 H22753 AI903100 AI903094 AW937823
X00364 D10493 K01904 K01S06 K00535 100058 AA410662 AW384760 AA304930 AI680985 X00198 H58025 AW998901
AV653447 N31654 AW610357 AW610369 AW862480 BE223010 AW384172 AW384219 AW384171 AW384218 AA298522
BE140421 AW945162 AW751711 AA514409 AW747912 AI214214 WB7741 AAS72406 AA554513 BE302087 AI249030

TABLE 12: shows genes, including expression sequence tags, that are down-regulated in
prostate tumor tissue compared to normal prostate tissue as analyzed using Affymetrix/Eos
Hu01 GeneChip array. Shown are the ratios of "average" normal prostate to "average"
prostate cancer tissues.



UnigeneID: Unigene number

Unigene Title: Unigene gene title

R1: Background subtracted normal prostate : prostate tumor tissue
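
As a rough illustration of how a ratio such as R1 could be computed, the sketch below subtracts a background level from each sample intensity, averages the normal and tumor groups, and divides. It is a minimal sketch under stated assumptions: the function names, the single scalar background, the clamping floor, and the sample intensities are all hypothetical and are not taken from the specification.

```python
from statistics import mean


def background_subtract(intensities, background, floor=1.0):
    """Subtract a background level from each raw intensity.

    Values are clamped at `floor` (an assumption of this sketch) so the
    later division never sees zero or negative signal.
    """
    return [max(x - background, floor) for x in intensities]


def r1_ratio(normal, tumor, background=0.0, floor=1.0):
    """Ratio of average background-subtracted normal signal to tumor signal."""
    normal_avg = mean(background_subtract(normal, background, floor))
    tumor_avg = mean(background_subtract(tumor, background, floor))
    return normal_avg / tumor_avg


# Hypothetical intensities for one probeset; not data from Table 12.
normal_samples = [520.0, 480.0, 500.0]
tumor_samples = [70.0, 50.0, 60.0]
example_r1 = r1_ratio(normal_samples, tumor_samples, background=20.0)
```

A value above 1 under this convention corresponds to lower signal in tumor than in normal tissue, which is the sense in which the genes of this table are described as down-regulated.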

Pkey ExAccn UnigeneID Unigene Title R1

100522 HG1763-HT1780 Prolactin-Induced Protein 17.4
130803 M81650 Hs.1968 semenogelin I 16.765
118068 N53943 Hs.13743 ESTs 13.225
114251 Z39898 Hs.21948 ESTs 12.7
112134 R46025 Hs.7413 ESTs 8.735
101436 M20542 Hs.158295 Human alkali myosin light chain 3 mRNA; complete cds 8.175
104028 AA361094 Hs.221128 ESTs 8.15
108944 AA149204 Hs.175783 ESTs; Highly similar to growth arrest inducible gene product [H.sapiens] 7.535
103838 AA174173 Hs.12622 ESTs 7.212
120469 AA251741 Hs.25882 DKFZP586M1824 protein 7.175
110279 H28231 Hs.27384 ESTs 6.701
127472 AA761378 Hs.192013 ESTs 6X42
133301 N35229 Hs.7037 pallid (mouse) homolog; pallidin 6.411
102457 U48807 Hs.2359 dual specificity phosphatase 4 6595
114011 W90385 Hs.15082 ESTs 6.15
101249 L33881 Hs.1904 protein kinase C; iota 6
123265 AA491209 Hs.105265 ESTs; Weakly similar to reverse transcriptase [M.musculus] 6
119322 T4S655 Hs.241569 ESTs; Modly smlr to !! ALU SUBFAMILY SQ WARNING ENTRY !! [H.sapiens] 5X5
101673 M61906 Hs.6241 phosphoinositide-3-kinase; regulatory subunit; polypeptide 1 (p85 alpha) 5J925
115586 AA399218 HSX2423 ESTs 5.7
120590 AA281760 Hs.111441 ESTs; Weakly similar to similar to Kruppel-like zinc finger protein [C.elegans] 5.7
109748 F10192 Hs.248323 Tubulin; alpha; brain-specific 5.625
134727 X80507 Hs.8939 yes-associated protein 65 kDa 5.5
129171 AA234048 Hs.7753 calumenin 5.486
120390 AA233122 Hs.111460 ESTs; Highly similar to multifunctional calcium/calmodulin-dependent protein kinase II delta2 isoform [H.sapiens] 5.4
131699 R68657 Hs.90421 ESTs; Modly smlr to !! ALU SUBFAMILY SX WARNING ENTRY !! [H.sapiens] 5.279
104490 N71503 Hs.43087 ESTs; Weakly similar to dysferlin [H.sapiens] 5.266
102124 U14528 Hs.29981 solute carrier family 26 (sulfate transporter); member 2 5.151
109280 AA196635 Hs.86081 ESTs 5.134
109707 F09739 Hs.185701 Homo sapiens mRNA full length insert cDNA clone EUROIMAGE 21920 5.075
108087 AA045709 Hs.40545 ESTs 5.075
135006 M21665 HsX29 myosin; heavy polypeptide 7; cardiac muscle; beta 5.055
119182 R80664 Hs.77067 ESTs 5.033
129806 R62444 Hs.173373 KIAA0931 protein 4.675
101435 M20543 Hs.1288 actin; alpha 1; skeletal muscle 4.626
125954 R93943 yt72c12.r1 Soares retina N2b4HR Homo sapiens cDNA clone IMAGE:275735 5', 4.6
113989 W87544 Hs.221184 ESTs 4.559
104432 J03460 HsX9949 prolactin-induced protein 4.451
112326 R56068 Hs.4268 ESTs 4.45
119063 R16833 Hs.53106 ESTs; Weakly similar to !! ALU SUBFAMILY J WARNING ENTRY !! [H.sapiens] 4.45
130376 R40873 Hs.155174 KIAA0432 gene product 4.201
122484 AA448286 Hs.98074 ESTs; Highly similar to atrophin-1 interacting protein 4 [H.sapiens] 4.2
104142 AA447006 ESTs; Moderately similar to !! ALU SUBFAMILY SQ WARNING 4.175
129413 N32787 Hs.11123 ESTs; Moderately similar to hypothetical protein 2 [H.sapiens] 4.1
103678 Z84483 Human DNA sequence from PAC 46H23, BRCA2 gene region chromosome 13q12-13 4.05
114266 Z40186 Hs.26409 ESTs 4.05
115206 AA262491 Hs.186572 ESTs 4.048
123723 AA609749 Hs.112759 ESTs; Highly similar to unknown protein [R.norvegicus] 4.041
129130 H97993 Hs.172788 ESTs; Weakly similar to KIAA0512 protein 4.028
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HS.72100 toTS 


q coo 


109585 


F02367 


HS27252 col 8 


OJO 


115134 AA257107 Hs.194331 ESTs 3.5
116083 AA455653 Hs.44581 ESTs; Weakly similar to HEAT SHOCK 70 KD PROTEIN 6 [H.sapiens] 3.459
120524 AA261852 Hs.192905 ESTs 3.45
116932 H74330 Hs.150000 ESTs
130746 AA256976 Hs.18800 ESTs; Weakly similar to KIAA0579 protein [H.sapiens] 3.42
107513 X05451 Hs.158295 Human alkali myosin light chain 3 mRNA; complete cds 3.417
118641 N702S8 Hs.49829 ESTs 3.407
126584 AI028384 Hs.127331 ESTs 3599
105134 AA159953 Hs.22895 ESTs; Weakly similar to arylsulfatase B precursor [H.sapiens] 3525
123502 AA600116 Hs.112526 ESTs 3518
138389 N50866 Hs.47135 ESTs 3517
105691 AA287097 Hs.75356 transcription factor 4 3515
131505 H85897 Hs.27755 ESTs 3509
120775 AA342104 Hs.96777 EST 35
105579 AA278824 Hs.19218 ESTs 3295
128190 AA946876 Hs.148376 ESTs 3592
100819 HG4020-HT4290 Transglutaminase 3588
130217 029956 Hs.152818 ubiquitin specific protease 8 3573
130068 AA608903 Hs.106220 KIAA0336 gene product 3569
134719 L07515 Hs.89232 chromobox homolog 5 (Drosophila HP1 alpha) 3566
110277 H29209 Hs.151231 ESTs; Highly similar to FYVE finger-containing phosphoinositide kinase [M.musculus] 326
127354 AM1B880 Hs.185797 ESTs 3512
129173 R60523 Hs.109087 ESTs 3.197
127464 AA970504 Hs.146103 ESTs 3.179
124923 R94500 Hs.108046 ESTs 3.175
122465 AA448164 Hs.99153 ESTs; Highly similar to CGI-73 protein [H.sapiens] 3.151
122027 AA431302 Hs.98721 EST; Weakly similar to N-copine [H.sapiens] 3.151
103329 X85134 Hs.72984 retinoblastoma-binding protein 5 3.15
129937 M95767 Hs.135578 chitobiase; di-N-acetyl- 3.15
134197 AA057341 Hs.87889 helicase-moi 3.15
107764 AA018219 Hs.226923 ESTs 3.125
121775 AA421773 Hs.161008 ESTs 3.125
114768 AA149007 Hs.182339 Ets homologous factor 3.12
132381 N48818 Hs.46884 ESTs 3.11
123105 AA485973 Hs.143947 ESTs 3.104
121176 AM00080 Hs.97774 ESTs 3.1
125053 T80620 Hs.186473 ESTs 3.075
105909 AA401739 Hs.5111 ESTs 3.066



119767 
115776 
111713 
115301 
118448 
106586 
110415 
105173 
101102 
110543 
125593 
100824 
106822 
131963 
111221 
113620 
105220 
123234 
125250 
'116196 
122100 
111712 
126569 
111132 
115307 



W72562 
AA424038 



AA280047 

N66412 

AA456598 

H48239 

AA182030 

L07594 

H58383 

R24464 



D11930 



129486 
119805 
125721 
103704 
128420 
120571 
.123059 
129462 
125166 
125992 
109431 
105077 
131388 
121080 
112575 
130244 



116355 
115316 
129677 
130971 
115054 
130285 
124308 



114800 
128625 
130159 
107127 
113547 
104639 
127609 



117634 



101609 M54927 



117142 
112602 
106828 
124377 



Hai8119 ESTs &Q57 

HS38197 ESTs 3.056 

H&220950 ESTs 3.05 

Hs.43948 ESTs 3.05 

Hs.49189 ESTs 3 

HS256269 ESTs 2.995 

Hs29739 ESTs; Weakty similar to RAS-RELATED PROTEIN RAB-3A [Ksaplens] 2379 

Hs3364 ESTs 2378 

Hs.79059 transforming growth factor; beta receptor III (betaglycan; 300kD) 2.976 

Hs.258544 ESTs 2.976 

H&202949 KIAA1 102 protein 2.964 

HG4058-HT4328 Oncogene Aml1-Evi-1, Fusion Activated 2.957 

AA481068 H&31835 ESTs 2.95 

Hs3592 ESTs 2.95 

Hs.15119 ESTs 2336 

Hs.17252 EST 2317 

Hs.17212 ESTs 2317 

Hs.106252 ESTs 2304 

H&222926 ESTs; Weakly similar to D2092.2 [Ceiegans] 23 

Hs.63386 ESTs 23 

Hs.41086 ESTs; Weakly simflar to OXYSTEROL-BINDING PROTEIN [H^apiensl 2396 

Hs.113716 ESTs 2395 

Hs.187698 ESTs;WeaWysirnaartoYer140wpIS.cerevisfee] 2395 

Hs.13149 ESTs; Highly similar to unknown function [H^apiens] 2394 

Hs.191346 ESTs 2386 

Hs.16827 KIAA0849 protein 2383 

Hs.220689 Ras-GTPase*ctivating protein SK3^omairvbinding protein 2379 

Hs.43213 ESTs 2375 

Hs.7503 ESTs 2371 

Ks.153688 ESTs 2368 

Ks.14146 ESTs; Weakly similar to unknown [Rsaplens] 2366 

Hs.128679 ESTs 2363 

Hs^38202 EST 236 

Hs.11 1732 IgG Fc binding protein 2356 

Hs. 172609 nudeobindinl 2354 

za36e07,rt Scares fetal Over spleen 1NFLS Homo sapiens cONA done 2352 

Hs.43635 ESTs 235 

Hs3558 ESTs 2347 

Hs32200 K1AA0480 gene product 2346 

Hs.177953 ESTs 2338 

Hs.17385 ESTs 2336 

Hs.153293 KIAA0701 protein 2325 

Hs.77910 3^ydroJcy-3-fnathy{gtytaryKk>€nzyrra Asynthas8 1 (soluble) 2316 

HS38650 ESTs 2313 

H&57846 ESTs 2306 

Hs.198891 senna/threofune-protein kinase PRP4 homobg 23 

Hs28707 signal sequence receptor; gamma (transtocon-assodated protein gamma) 2799 

Hs.87729 ESTs 2.795 

H&202968 ESTs 2.792 

Hs.227146 Homo sapiens mRNA; cONA DKFZp564J142 (from done DKFZp564J142) 2.7B3 

Hs.191959 ESTs 2.778 

Hs.131887 ESTs; Weakly similar to ORF YNL227C [S.cerevisiae] 2.768 

Hs.102652 ESTs; WeaJdy similar to KIAAC^ - 2.766 

Hs.151310 PDZ domain protein (DrosophQa inaDHtke) 2.75 

Hs£2119 ESTs 2.742 

Hs.15233 ESTs 2.734 

Hs.18214 ESTs 2.727 

Hs.150318 ESTs 2.726 

Hs.10056 ESTs 2.725 

yg85c3.s1 Soares infant brain 1NIB Homo sapiens cDNA clone 2.725 

Hs.154054 ESTs 2.708 
Hs.107854 ESTs; Weakly slmBarto SODIUM- AND CHIORIDE-DEPENDENTGLYCINE 

TRANSP 2.706 
Hs.1787 proleolipid protein 1 (Pelizaeus-Merzbacher disease; spastic paraplegia 2; 

uncomplicated) -2.704 

Hs.42251 ESTs 2.7 

Hs.203365 ESTs 2395 

Hs.13797 ESTs 2.68 

Hs.179833 ESTs 2375 
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W45491 

W01626 

AA227972 

AA142919 

R34531 

AA398720 

R73816 

R26206 

AA427783 

AA504356 

AA260627 

U48736 

H20332 
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AA063546 

H93575 

AA732329 
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AA242816 
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AA620504 

T90746 

AA004622 
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AA490964 
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101026 J04970 carboxypeptidase M 2.675 

124560 N66393 * Hs.102754 ESTs 2575 

124066 H02494 Hs.101615 ESTs 2.671 

130281 R12777. Hs,15395 ESTs; Weakly similar to ARGINYL-TRNA SYNTHETASE (H^apiens] 2.66 

110949 N49602 Hs.13308 ESTs 255 

111031 N54839 Hs221085 ESTs; Highly similar to mediator [Rsapiens] 2.633 

121770 AA421714 Hs.1 1469 KIAAD89B protein 2.63 

134132 U32519 H&220689 Ras-GTPase-activating protein SH3Hiomain-bmtling protein 2.626 

112424 R62452 Hs.191265 ESTs 2.625 

122544 AA451679 Hs, 194410 ESTs 2.625 

134425 X90568 Hs.1 72 004 titin 2.624 

111114 N63391 Hs.9238 ESTs 2.619 

116119 AA459242 Hs.44445 ESTs; Weakly similar to Ketch motif containing protein {H^aplens] 2.615 

112079 R44164 H&23014 ESTs 2.6 

123033 AA481271 Hs.193945 ESTs 2591 

124196 H52617 Hs.144167 ESTs £586 

125873 H14437 yt25a04.r1 Scares breast 3NbHBst Homo sapiens cONA done 258 

117684 N40184 Hs.45050 ESTs 2575 

134938 D30037 Hs.168326 phosphotldyljnosito) transfer protein; beta 2575 

131822 AA215647 Hs.200332 ESTs 2568 

135185 U71203 Hs.96038 Ric (DrosophllaHike; expressed in many tissues 2564 

117690 N40467' Hs.93834 ESTs . 2557 

118807 N78582 Hs50732 protein kinase; AMP-«ctivated; beta 2 non-catelyBD subuntt 2552 

121369 AA405657 Hs.128791 Human DNA sequence from clone 967N21 on chromosome 20p12^-13. Contains 255 

1 14860 AA2351 12 Hs.106227 ESTs; Moderately similar to similar to murine RNA-binding protein [H^apiens] 2549 

121857 AA426017 Hs.62694 ESTs; Highly similar to DNA-REPAJR PROTEIN COMPLEMENTING 2548 

110190 H20560 Hs£44624 ESTs 2548 

132573 AA045333 Hs51743 ESTs; Weakly similar to !! ALU SUBFAMILY SB2 WARNING ENTRY 11 [H^apiens] 2.542 

109706 F09729 Hs. 12780 ESTs 2537 

135109 AA410391 Hs.94592 ktotho 2525 

132810 R37027 Hs5737 KIAA0475 gene product 2525 

124879 R73588 Hs.101533 ESTs 2525 

103840 AA174190 Hs50932 ESTs 2525 

119066 R22196 Hs.34492 ESTs 2519 

114833 AA234362 Hs.87310 ESTs; Moderately similar to CGH56 protein [H^apiens] 2507 

112998 T23555 Hs.103288 ESTs 25 

123312 AA496258 Hs.99601 ESTs 2.499 

121873 AA426270 Hs.145696 splicing factor (CC1.3) 2.491 

123321 AA496884 H&23972 ESTs 2.491 

107760 AA018042 Hs.95078 EST 2.483 

102580 U60808 Hs.152981 CDP-diacylglyceroj synthase (phosphaiidale cyBdylytosfeiase) 1 Z481 

103053 X56741 Hs5947 met transfonning oncogene (derived from ceD One NK14)-RAB8homolog 2.475 

124756 R38100 Hs.106294 ESTs 2.475 

112936 T15665 H&6185 ESTs; Weakly similar to BcDNA.GH12174 p jnelanogaster] 2475 

125178 W58202 Hs.125731 ESTs 2.475 

112423 R62447 H&22123 ESTs 2.471 

123515 AA600323 Hs.1 12535 EST 2.482 

102842 U95Q20 Hs.21803 calcium channel; voltage-dependent; beta 4 sutwnit 2.457 

102400 U42390 Hs.171957 triple functional domain (PTPRF Interacting) 2.455 

113187 T56056 Hs.9992 ESTs 2.452 

131687 L11066 Hs.3069 heat shock70kD protein 9B (mortal) 2.448 

115314 AA280583 H&256501 ESTs 2.437 

128211 AI206427 Hs.166707 ESTs; Highly similar to Ran-binding protein 2 [H^apiens] 2.43 

134281 L11005 Hs51047 aldehyde oxidase 1 2.425 

115985 AA447709 Hs.132094 ESTs; Moderately similar to putative transcription factor CA150 [H.sapiens] 2.425 

111348 N90041 Hs.9585 ESTs 2.418 

129430 AA256842 Hs.1 97877 Homo sapiBns clone 23777 putafive transmembrane GTPase mRNA; partial cds 2416 

133863 C13990 Hs.76930 synuclein; alpha (non A4 component of amyloid precursor) 2417 

111164 N66857 Hs.14808 ESTs; Weakly similar to U ALU CLASS C WARNING ENTRY !! [H.sapiens] 2.416 

132143 AA257056 Hs.7972 KIAA0871 protein 2.412 

130330 M55047 Hs.154679 synaptotagmin 1 2.408 

114219 Z39451 Hs*7389 ESTs 2406 

117101 H94043 Hs.24341 DKFZP5B61 141 9 protein 2.403 

125433 AA034325 Hs54320 ESTs 24 

111099 N62506 Hs.21956 ESTs 2.4 

120323 AA195405 Hs.1 10347 Homo sapiens mRNA for alpha integrin binding protein 80; partial 2-397 

118624 N69996 H&21801 ESTs 2.394 

123570 AA608955 Hs. 109653 ESTs 2389 

123562 AA608893 Hs.190065 ESTs 2388 
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131546 AA262821 Hs.28578 musdebimd prosophDa>fike 2385 

103143 X66141 H&75535 myosin; Bght polypeptide 2; regulatory; cardiac; stow 2384 

123645 AA609310 Hs.188691 ESTs 2383 

130123 AA001835 Hs.150390 zinc ringer protein 262 2379 

131682 AA428368 Hs.30654 ESTs 2378 

115909 AA436666 H&59761 ESTs 2375 

125168 W45574 Hs352497 ESTs 2372 

123973 C14805 Hs.182151 ESTs 2361 

135197 U76456 Homo sapiens tissue inhibitor of metaitoprotainase 4 mRNA, complete cds 2357 

118689 N71545 Hs.184544 ESTs 2357 

107734 AA016225 Hs.93386 ESTs 2354 

124590 N69220 H&41381 ESTsWeaJdys^rtoiibkjuitmh^^ 235 

111163 N66850 Hs. 17606 ESTs 2348 

112349 R58877 H&22665 ESTs; Moderately similar to dJ83L6.1 [Rsapiens] 2345 

129076 AA262179 Hs.1 69343 ESTs 2345 

134238 R81509 K&184571 splicing factor; argmina/serine-rich 11 2341 

116766 H13260 Hs35097 ESTs 2.336 

106331 AA436853 HS34795 ESTs 2.333 

129003 AA443752 Hs.10784 ESTs 2332 

132368 AA599814 Hs.46637 ESTs; Weakly similar to cDNA EST yk289g5.5 comas from this gene [C.elegans] 2332 

124697 R06273 Hs.1 86467 ESTs; Modiy smir to II ALU SUBFAMILY J WARNING ENTRY !J [H^apiens] 2322 

120273 AA176688 HS-221139 ESTs 2313 

127110 AA304993 Hs.100861 ESTs; Weakly simHar to p60 katanin [H^apiens] * 2307 

105450 AA252621 Hs.93842 ESTs 2301 

119819 W74371 H&56383 ESTs 2397 

102302 U33052 Hs.69171 protein kinase C-Hte 2 2388 

130596 N74353 Hs. 16475 ESTs 2382 

114161 Z38904 Hs323B5 ESTs; Weakly sim3ar to KIAA0970 protein [H^apiens] 2378 

130542 U64675 Human sperm membrane protein BS-63 mRNA, complete cds 2377 

104491 N71513 Hs.39328 ESTs 2375 

116988 H82527 ys69e12.s1 Scares retina N2b4HR Homo sapiens cDNA clone 2375 

126823 AA370120 Hs.7870 ESTs; Weakly similar to Yb350wp [Sxerevisiae] 2373 

108800 AA129731 Hs.80424 ESTs 2373 

101310 U1607 Hs.934 glucosamine (N-acetyl) transferase 2; 1-branching enzyme 2369 

126842 W19498 Hs31085 ESTs 2355 

127251 AA936428 Hs.128638 ESTs 2351 

124647 N91947 Hs.125033 ESTs 2349 

127112 AI143906 Hs.125103 ESTs 2347 

101973 S82597 Hs30120 UDP^cety^aJpfe-D^ctosamine^iypepti'de 2346 

120999 AA398302 Hs.127437 ESTs 2345 

130225 AA599583 Hs. 15299 HMSA-indurib!e 2343 

119980 W88678 Hs349247 heterogeneous nuclear protein similar to rat heDx destabilizing protein 2343 

124222 H61053 Hs322844 ESTs 2.24

129199 H90914 Hs.128629 ESTs 2.236

106802 AA479101 Hs.16570 ESTs; Weakly similar to !! ALU SUBFAMILY SQ WARNING ENTRY !! [H.sapiens] 2.231

126160 N90960 Hs347Z77 ESTs; Weakly similar to transformation-related protein [H.sapiens] 2.229

104627 AA001976 Hs.19603 ESTs 2.228

106474 AA450212 Hs.42484 Homo sapiens mRNA; cDNA DKFZp564C053 (from clone DKFZp564C053) 2.226

113096 T40927 Hs3345 ESTs 2.225

135336 AA452822 Hs39027 ESTs 2.225

135344 R62976 Hs.168491 ESTs; Moderately similar to TRF1-interacting ankyrin-related 2.225

126156 AA508354 Hs.118448 ESTs; Moderately similar to AKT3 protein kinase [H.sapiens] 2.222

128885 AA397841 Hs.180141 cofilin 2 (muscle) 2.218

107900 AA026385 Hs.176600 ESTs; Moderately similar to !! ALU SUBFAMILY SB2 WARNING 2.217

114481 AA033562 Hs.151572 ESTs 2.212

109292 AA199828 Hs.188662 ESTs 2.212

104257 AF006265 Hs.9222 estrogen receptor-binding fragment-associated gene 9 2.209

132932 T15482 Hs.6093 ESTs 2.204

127392 AA262728 Hs.14896 Homo sapiens clone 24590 mRNA sequence 2.204

104641 AA004652 Hs.18564 ESTs 2.2

122529 AA449828 Hs.99229 ESTs 2.195 

124307 H93562 Hs.162395 proline synthetase co-transcribed (bacterial homolog) 2.193

1330)1 S95936 Hs.75155 transferrin 2.193

119904 W85709 Hs.128927 ESTs; Weakly similar to !! ALU SUBFAMILY SP WARNING ENTRY !! [H.sapiens] 2.192

100348 D64109 Hs.4994 transducer of ERBB2; 2 (TOB2) 2.185

126871 AA351779 Hs300334 ESTs 2.18

127793 AI298835 Hs30445 ESTs; Weakly similar to transcription regulator Staf-50 [H.sapiens] 2.178

105149 AA169253 Hs3958 ESTs 2.177 
121367 AA40S648 zw39gB.s1 Sc*resjotalfetus_Nb2H^
111836 R36228 Hs35119 ESTs 2.175

133394 R16759 Hs.237225 ribosomal protein S5 pseudogene 1 2.175

123207 AA489697 Hs.145053 ESTs 2.175

129801 F11087 Hs.239666 ESTs 2.175

1(3393 X94612 Hs.41749 protein kinase; cGMP-dependent; type II 2.161

132415 AA043223 Hs.4815 nudix (nucleoside diphosphate linked moiety X)-type motif 3 2.157

106369 AA443828 Hs.25324 ESTs 2.157

122963 AA478446 Hs.69559 KIAA1096 protein 2.156

133473 M19309 Hs.73980 troponin T1; skeletal; slow 2.155

134257 C06270 HsJ078 Homo sapiens mRNA; cDNA DKFZp586L081 (from clone DKFZp586L081) 2.155

135156 AA056012 Hs.9552 binder of Arl Two 2.151

104055 AA393755 Hs.117211 ESTs; Highly similar to CGI-62 protein [H.sapiens] 2.15

102313 U33921 HSU33921 Clontech adult lung cDNA library (HL1158a) Homo sapiens cDNA 2.15

109788 F10638 Hs.12432 Homo sapiens clone 24407 mRNA sequence 2.15

103507 Y10032 Hs.159640 serum/glucocorticoid regulated kinase 2.15

116000 AA448710 Hs.41327 ESTs 2.15

105858 AA399164 Hs.227676 ESTs; Moderately similar to !! ALU SUBFAMILY SQ 2.137

103153 X66534 Hs.75295 guanylate cyclase 1; soluble; alpha 3 2.137 

126202 AA652238 Hs.199726 ESTs 2.135 

115955 AA446121 Hs.44198 Homo sapiens BAC clone RG054D04 from 7q31 2.134

104164 AA458770 Hs.27023 KIAA0917 protein 2.132

108692 AA121270 Hs.82960 ESTs 2.128

122878 AA465341 Hs.99640 ESTs 2.126

134771 L13939 Hs.89576 adaptor-related protein complex 1; beta 1 subunit 2.125

104298 D31120 Hs.40368 adaptor-related protein complex 1; sigma 2 subunit 2.125

104840 AA039595 Hs.42458 Homo sapiens mRNA; cDNA DKFZp586C1817 (from clone DKFZp586C1817) 2.125

122180 AA435798 Hs.98835 ESTs; Moderately similar to putative ring zinc finger protein 2.125

131012 H01992 Hs.202949 KIAA1102 protein 2.125

134092 H17490 Hs.7905 ESTs; Highly similar to sorting nexin 9 [H.sapiens] 2.123

118617 N69666 Hs.183413 ESTs; Moderately similar to !! ALU SUBFAMILY J WARNING ENTRY !! [H.sapiens] 2.123

107155 AA621202 Hs.7946 DKFZP586D1519 protein 2.12

130925 N71935 Hs.169378 multiple PDZ domain protein 2.12

135167 U63717 Hs.95821 osteoclast stimulating factor 1 2.118 

105952 AA405263 Hs.181400 ESTs 2.109 

110308 H38148 Hs32775 ESTs 2.108

116368 AA521186 Ha44217 ESTs 2.107

132939 U76189 Hs.61152 exostoses (multiple)-like 2 2.102

117881 N50073 Hs.84926 ESTs; Highly similar to B-IND1 protein [M.musculus] 2.1

121723 AA419622 Hs.104800 ESTs; Weakly similar to Mouse 195 mRNA; complete cds [M.musculus] 2.096

103500 Y09443 Hs.22580 alkylglycerone phosphate synthase 2.094

121429 AA406293 Hs.193498 ESTs 2.093

134632 AA398710 Hs.174139 chloride channel 3 2.091

129785 F10980 Hs.184780 ESTs 2.09

111065 N58193 Hs.18740 ESTs; Weakly similar to 1-evidence 2.089

114710 AA129931 Hs.79081 protein phosphatase 1; catalytic subunit; gamma isoform 2.083

132711 N73702 Hs238927 ESTs 2.083

133377 R05490 Hs.7239 SEC24 (S. cerevisiae) related gene family; member B 2.079

124773 R40923 Hs.106604 ESTs 2.078

117759 N47587 Hs47345 ESTs; Weakly similar to TROPOMODULIN [H.sapiens] 2.076

127386 AI457411 Hs.106728 ESTs 2.076

101167 L15309 Hs.193677 zinc finger protein 141 (clone pHZ-44) 2.075

109597 F02582 Hs.14474 ESTs 2.074 

124390 N29325 Hs.7535 ESTs; Highly similar to COBW-like placental protein [H.sapiens] 2.07

116225 AA478609 Hs.47278 Human Chromosome 16 BAC clone CIT987SK-A-735G6 2.07

131243 R16667 Hs.24752 spectrin SH3 domain binding protein 1 2.069

130557 T90830 Hs.15981 ESTs; Weakly similar to line-1 protein ORF2 [H.sapiens] 2.067

134103 D14826 Hs.155924 cAMP responsive element modulator 2.064

108833 AA131866 Hs.61661 ESTs; Weakly similar to DY3j6 [C.elegans] 2.063

112286 R53765 Hs.158135 KIAA0981 protein 2.063

125624 AA165411 zq49a01.r1 Stratagene hNT neuron (#937233) Homo sapiens cDNA clone 2.061

124612 N72200 Hs.13913 ESTs 2.058

116335 AA495830 Hs^7013 ESTs 2.057

112248 R51361 Hs23423 ESTs 2455

115789 AA424754 Hs.43149 ESTs 2.056

107029 AA599219 Hs.187492 ESTs; Weakly similar to ALR [H.sapiens] 2.056

110294 H30270 Hs.165062 ESTs 2.054

120532 AA262354 Hs.186648 ESTs 2.054

118180 N59249 Hs.48349 ESTs 2.052

132018 AA293194 Hs.3737 ESTs 2.052

132617 AA171913 Hs.5338 carbonic anhydrase XII 2.05

131526 N36167 Hs.28274 ESTs 2.05

113254 T64438 Hs.11449 DKFZP564O123 protein 2.05

122785 AA459978 Hs.99508 ESTs 2.05

107203 D20426 Hs.5656 EST 2.05

105713 AA291321 Hs.184319 ESTs; Moderately similar to KIAA1006 protein [H.sapiens] 2.046

129385 D82675 Hs.110950 Homo sapiens clone 25007 mRNA sequence 2.042

119116 R43845 Hs.64595 DKFZP566E2346 protein 2.04

110405 AA600253 hsjooO ESTs; Highly similar to host cell factor 2 [H.sapiens] 2.04

125924 AA526849 Hs.82109 syndecan 1 2.039

105599 AA279442 Hs.143460 protein kinase C; mi 2.037

119741 W70205 Hs.43670 kinesin family member 3A 2.037

101449 M21494 Hs.118843 creatine kinase; muscle 2.036

107109 AA609943 Hs.32793 ESTs 2.034

117040 H89112 yw25e5.s1 Morton Fetal Cochlea Homo sapiens cDNA clone IMAGE25328 2.034

132906 AA142857 Hs.234896 ESTs; Highly similar to geminin [H.sapiens] 2.031

105479 AA255546 Hs.23467 ESTs 2.027

102031 U04898 Hs.2156 RAR-related orphan receptor A 2.027

119846 W80363 Hs.56446 ESTs 2.024

124809 R46482 Hs.106875 ESTs 2.024

130266 AA041548 Hs.154023 KIAA0573 protein 2.023

124457 N50114 Hs.128704 ESTs 2.017

125144 W37999 Hs.24336 ESTs 2.017

120581 AA281257 Hs.125868 ESTs 2.014

104931 AA062731 Hs.108319 thyroid hormone receptor-associated protein; 150 kDa subunit 2.012

120548 AA278846 Hs.187634 ESTs 2.011

113933 W81362 Hs.30567 ESTs 2.011

123072 AA485041 Hs.104308 ESTs 2.009

123648 AA609323 Hs.112689 ESTs 2.008

116875 H67749 Hs.161022 EST 2.003

103179 X69398 Hs.62665 CD47 antigen (Rh-related antigen; integrin-associated signal transducer) 1.995

103478 Y07755 Hs.38991 S100 calcium-binding protein A2 1.995

111007 N53378 HS22543 ESTs 1.995

120470 AA251797 zs11&s1 NCI_CGAP_GCB1 Homo sapiens cDNA clone 1.989

112280 R53457 HSJ26040 ESTs; Weakly similar to fatty acid omega-hydroxylase [H.sapiens] 1.989

114127 Z38652 Hs.106961 ESTs; Weakly similar to TYL [H.sapiens] 1.988

129863 AA151005 Hs.129872 sperm surface protein 1.988

106320 AA436608 ESTs 1.988

108933 AA147224 Hs.71814 ESTs 1.986

105906 AA401633 Hs.22360 ESTs 1.982

109029 AA157911 Hs.72200 ESTs 1.982

118470 N66769 HSX2781 ESTs 1.975

115358 AA281886 HSX8923 ESTs 1.975

115257 AA279060 Hs.193516 B-cell CLL/lymphoma 10 1.974

126879 AA719776 zh38g04.s1 Soares_pineal_gland_N3HPG Homo sapiens cDNA clone IMAGE:414390 1.974

109547 F01479 Hs.26956 ESTs 1.973

127111 AA805726 HS220509 ESTs 1.969

101266 L36645 Hs.73964 EphA4 1.966

129319 AA037467 HsX0340 ESTs 1.965

106211 AA428240 Hs.126083 ESTs 1.962

112753 R93696 Hs.169882 ESTs 1.961

120489 AA255538 Hs.180504 ESTs 1.959

129699 AA458578 Hs.12017 KIAA0439 protein; homolog of yeast ubiquitin-protein ligase Rsp5 1.956

105425 AA251129 Hs.24416 ESTs 1.953

134740 L37362 Hs.69455 opioid receptor; kappa 1 1.95

109324 AA210700 Hs.86405 Homo sapiens mRNA; cDNA DKFZp564P056 (from clone DKFZp564P056) 1.95

124303 H93043 Hs.107070 ESTs 1.95

102337 U36922 Human fork head domain protein (FKHR) mRNA, 3' end 1.948

109441 AA228100 Hs.86998 nuclear factor of activated T-cells 5 1.946

127364 AA179573 HsX0G61 progesterone binding protein 1.942

105255 AA227498 Hs.3623 ESTs 1.942

130672 L19783 Hs.177 phosphatidylinositol glycan; class H 1.942

104301 D45332 Hs.6783 ESTs 1.94

132442 R62589 Hs.167419 ESTs 1.939

105519 AA258063 HS23438 ESTs 1.937

132902 AA490969 Hs.168147 ESTs 1.936

118873 N89881 Hs.44577 ESTs 1.936

114124 Z38595 Hs.125019 ESTs; Highly similar to KIAA0886 protein [H.sapiens] 1.934


115075 AA255486 Hs.88045 ESTs 1.933

110fi95 H93483 Hs.124777 ESTs 1.931

105360 AA236209 Hs.187626 ESTs 1.931

124998 156013 Hs.77910 3-hydroxy-3-methylglutaryl-Coenzyme A synthase 1 (soluble) 1.929

121816 AA424814 Hs.187509 ESTs 1.927

111717 R23241 Hs.110776 STAT induced STAT inhibitor-2 1.925

128874 H06245 Hs.106801 ESTs 1.925

109391 AA219699 Hs.184245 KIAA0929 protein Msx2 interacting nuclear target (MINT) homolog 1.913

126129 H82165 Hs.40334 ESTs 1.911

115553 AA389027 Hs.71414 ESTs 1.905

113811 W44928 Hs.4878 ESTs 1.905

108345 AA070906 zm66d1.s1 Stratagene neuroepithelium (#937231) Homo sapiens cDNA clone 1.904

120472 AA251875 Hs.104472 ESTs; Weakly similar to Gag-Pol polyprotein [M.musculus] 1.903

116602 D80063 Hs£41673 EST 1.901

121121 AA399371 Hs.189095 ESTs; Weakly similar to zinc finger protein SALL1 [H.sapiens] 1.9

125330 AA401804 Hs.114574 ESTs 1.896

130095 F01831 Hs.14838 ESTs 1.894

119782 W72S82 Hs.58262 ESTs 1.894

104115 AA428090 Hs.26102 ESTs 1.893

131313 C17938 Hs52370 Homo sapiens mRNA; cDNA DKFZp564O0122 (from clone DKFZp564O0122) 1.891

105583 AA278907 HS24549 ESTs 1.891

122825 AA461195 Hs59580 ESTs 1.887

119495 W35390 Hs.55533 ESTs 1.886

130309 AA134289 Hs.15423 Homo sapiens BAC clone RG114B19 from 7q31.1 1.886

125628 AA418069 Hs241493 natural killer-tumor recognition sequence 1.886

110611 H65947 Hs.14671 ESTs; Highly similar to gene ERCC5 protein [H.sapiens] 1.885

117301 N22569 Hs.43215 ESTs 1.884

131406 N92239 Hs.26471 Wnt inhibitory factor-1 1.881

126428 AA013312 Hs54988 ESTs 1.881

120285 AA182882 Hs.111110 titin-cap (telethonin) 1.878

112724 R91753 Hs.17757 ESTs 1.878

103121 X63679 Hs.4147 translocating chain-associating membrane protein 1.875

124381 N26765 Hs.109008 ESTs 1.875

117226 N20468 Hs.177322 ESTs; Weakly similar to putative p150 [H.sapiens] 1.875

105610 AA279991 Hs.124691 ESTs; Weakly similar to trithorax homologue 2 [H.sapiens] 1.875

111229 N69113 Hs.110855 ESTs 1.875

120627 AA285079 Hs.190474 ESTs 1.873

107048 AA500012 Hs.10669 ESTs; Moderately similar to KIAA0400 [H.sapiens] 1.872

104041 AA381902 Hs.197114 RNA binding protein 1.872

115162 AA258366 Hs527806 ras GTPase activating protein-like 1.872

102239 U26726 Hs.1376 hydroxysteroid (11-beta) dehydrogenase 2 1.87

100043 M10098 AFFX control: 18S ribosomal RNA 1.868

120296 AA191353 Hs.22385 ESTs; Weakly similar to KIAA0970 protein [H.sapiens] 1.867

129011 S72669 Hs.107932 DNA segment; single copy; probe pH4 (6^^ 1.867

134651 R44479 Hs5Q232 KIAA0552 gene product 1.866

117392 N26175 Hs.93405 ESTs 1.864

114530 AA053027 Hs.191797 ESTs 1.863

123541 AA608794 Hs.112592 ESTs 1.863

124890 R78618 HS54145 ESTs; Weakly similar to RAS-RELATED PROTEIN RAB-8 [H.sapiens] 1.862

1CB299 AA233511 Hs.194720 ATP-binding cassette; sub-family G (WHITE); member 2 1.861

103560 Z20656 Hs.182787 myosin; heavy polypept 6; cardiac muscle; alpha (cardiomyopathy; hypertrophic 1) 1.861

113073 T33637 Hs.6841 ESTs 1.86

120407 AA235040 Hs.107283 ESTs 1.859

103892 AA243523 Hs.17155 ESTs 1.858

123795 AAS20381 HsJ0488 ESTs 1.857

108524 AA084323 Hs58138 ESTs 1.857

113953 W85812 Hs.187554 ESTs 1.856

110721 H97678 HS51319 ESTs 1.856

129426 AA412087 Hs.168272 EST;Highr/smlrtoprotirmiblto^^^ 1.853

112102 R44840 HS513Q3 ESTs 1.852

118502 N67317 HS50150 ESTs 1.852

107619 AA004955 Hs.60015 ESTs 1.851

100436 D87446 Hs.75912 KIAA0257 protein 1.85

120652 AA287312 Hs.191648 ESTs 1.85

121643 AA417078 Hs.193767 ESTs 1.843

117387 N26011 Hs53810 ESTs 1.843

132084 Y12394 Hs5866 karyopherin alpha 3 (importin alpha 4) 1.843

124449 N48593 Hs.121820 ESTs 1.841

120263 AA173440 Hs.193919 ESTs 1.838

127226 AA731036 Hs5463 ribosomal protein S23 1.838

111837 R36447 Hs.24453 ESTs 1.835

128727 M64174 Hs.50651 Janus kinase 1 (a protein tyrosine kinase) 1.834

114439 AA018937 Hs.128629 ESTs 1.833

102332 U35637 Human nebulin mRNA, partial cds 1.83

126579 W72979 Hs.146082 ESTs 1.83

102341 U37122 Hs3110 adducin 3 (gamma) 1.83

114246 Z39848 Hs.12079 ESTs 1.828

131757 D17532 Hs316 DEAD/H (Asp-Glu-Ala-Asp^ ^ 1.823

108904 AA136521 Hs.71148 ESTs; Weakly similar to putative p150 [H.sapiens] 1.823

115084 AA255566 Hs.42484 Homo sapiens mRNA; cDNA DKFZp564C053 (from clone DKFZp564C053) 1.823

131957 AA609008 Hs.183232 ESTs 1.822

100131 D12485 Hs.11951 phosphodiesterase I/nucleotide pyrophosphatase 1 (homologous to mouse Ly-41 antigen) 1.822

124163 H30539 Hs.189838 ESTs 1.821

118204 N59859 Hs.48443 ESTs 1.821

107727 AA016021 Hs.173091 DKFZP434K151 protein 1.82

100357 D78156 Hs£41 548 RAS p21 protein activator 2 1.82

116295 AA489016 Hs.91216 ESTs; Highly similar to partial CDS; human putative tumor suppressor [H.sapiens] 1.82

124833 R54112 Hs.128697 ESTs 1.817

122587 AA453255 Hs.6968 ESTs 1.817

114359 Z41589 Hs.153483 ESTs; Moderately similar to H1 chloride channel [H.sapiens] 1.815

111289 N72253 Hs^38246 ESTs 1.813

110826 N30068 Hs.15347 ESTs 1.812

104106 AA422123 Hs.42457 ESTs 1.811

130043 AA055404 Hs.193953 ESTs; Weakly similar to !! ALU SUBFAMILY J WARNING ENTRY !! [H.sapiens] 1253

115864 AA432080 Hs31200 ESTs 1.81

129737 AA056140 Hs.122684 ESTs 1.81

124477 N53158 Hs.102682 ESTs 1.809

100782 HG3740-HT4010 Basic Transcription Factor 2, 34 Kda Subunit 1.806

106101 AA421053 Hs.34395 ESTs 1.806

115479 AA287586 zs52h09.s1 NCI_CGAP_GCB1 H sapiens cDNA clone IMAGE:701153 1.804

116104 AA456635 Hs.78524 ESTs 1.804

114173 Z39050 Hs21963 ESTs 1.804

132632 N59764 Hs3398 guanine-monophosphate synthetase 1.803

119135 R49548 Hs.169681 death effector domain-containing 1.802

131559 N91087 Hs.28728 ESTs; Weakly similar to F55A123 [C.elegans] 1.801

126922 AA177138 Hs.161671 ESTs 1.8

117375 N25427 Hs.108812 ESTs 1.8

103571 Z25535 Hs.211608 nucleoporin 153kD 1.8

105978 AA406367 Hs.15973 ESTs 1.8

125904 H22372 Hs.163586 ESTs 1.799 

133883 AA397915 Hs.77221 choline kinase 1.798

105777 AA348412 Hs.23096 ESTs 1.797

110166 H19480 Hs.174309 ESTs 1.796

105038 AA130273 Hs.7584 ESTs; Weakly similar to hypothetical protein; similar to [H.sapiens] 1.796

105427 AA251330 Hs28248 ESTs 1.795

115278 AA279757 Hs37466 ESTs; Weakly similar to BACN32G11.d [D.melanogaster] 1.794

133104 L13698 Hs.65029 growth arrest-specific 1 1.794

131170 N48674 Hs^3796 Human DNA sequence from clone 1052M9 on chromosome Xq25. Contains the 1.792

100136 D13540 Hs32868 protein tyrosine phosphatase; non-receptor type 11 1.791

127263 AA331157 EST35035 Embryo, 6 week, subtracted (total cDNA) I Homo sapiens cDNA 1.79

114157 Z38878 Hs.24979 ESTs 1.79

125601 AI096717 Hs.247043 KIAA0525 protein 1.788

118472 N66816 Hs.42179 ESTs 1.787

112456 R63925 Hs.28464 ESTs 1.787

130236 N69682 Hs.51957 SC35-interacting protein 1 1.786

133297 AA600057 Hs.70266 KIAA0905 protein 1.784

125650 R40096 Hs.176578 ESTs 1.784

132056 T89386 Hs.38176 KIAA0606 protein; SCN Circadian Oscillatory Protein (SCOP) 1.783

129093 AA262710 Hs.108614 KIAA0627 protein 1.783

123176 AA489020 Hs.193424 ESTs 1.782

106340 AA441792 Hs.22857 chord domain-containing protein 1 1.781

100598 HG2463-HT2559 Guanine Nucleotide-Binding Protein G25k 1.779

104038 AA374532 EST66676 HSC172 cells I Homo sapiens cDNA 5' end, mRNA sequence 1.778

122235 AA436475 Hs.190104 ESTs 1.777

105104 AA151771 Hs.76941 ATPase; Na+/K+ transporting; beta 3 polypeptide 1.776

107601 AA004638 Hs.30223 ESTs 1.776

131467 W68255 Hs27194 DKFZP434K171 protein 1.776

118449 N66413 Hs.172466 ESTs; Weakly similar to KIAA0775 protein [H.sapiens] 1.776

107869 AA034030 Hs.155212 methylmalonyl Coenzyme A mutase 1.775

115527 AA342079 Hs.252055 ESTs 1.775

132471 T16305 Hs.49349 beta-site APP-cleaving enzyme 1.775

105966 AA406105 Hs.5344 adaptor-related protein complex 1; gamma 1 subunit 1.774

127548 AA373091 Hs.93832 Homo sapiens clone 24483 unknown mRNA; partial cds 1.774

106217 AA428379 Hs24870 ESTs 1.773

131214 N26777 Hs.172635 ESTs 1.773

106295 AA435664 Hs.8583 similar to APOBEC1 1.773

106328 AA436705 Hs28020 KIAA0766 gene product 1.772

124661 N93797 Hs3090 EphB1 1.772

122988 AA479166 Hs.105633 ESTs 1.772

115504 AA291846 Hs.42736 ESTs 1.771

105168 AA180208 Hs.16606 ESTs; Highly similar to CGI-32 protein [H.sapiens] 1.767

129153 AA188618 Hs.181461 ariadne; Drosophila; homolog of 1.766

105829 AA398290 Hs.21965 ESTs 1.764

101811 M86917 HS24734 oxysterol binding protein 1.764

100138 D13628 Hs.2463 angiopoietin 1 1.764

124704 R07335 ye96c1.s1 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone 1.763

122314 AA442257 Hs.192076 ESTs 1.762

109865 H02566 Hs.191268 Homo sapiens mRNA; cDNA DKFZp434N174 (from clone DKFZp434N174) 1.761

106206 AA428069 Hs.88519 KIAA1046 protein 1.758

107135 AA620782 Hs.23247 ESTs 1.757 

105760 AA338960 HS28170 ESTs 1.756 

106288 AA435536 Hs.24336 ESTs 1.756

103S68 AA304566 Hs.3542 ESTs 1.756

129559 AA234945 Hs.11360 ESTs 1.756

117885 N50112 Hs.47023 ESTs 1.754

107032 AA599472 Hs.247309 succinate-CoA ligase; GDP-forming; beta subunit 1.754

124807 R45963 Hs£33811 ESTs; Weakly similar to ORF2 [M.musculus] 1.753

100276 D42047 Hs.82432 KIAA0089 protein 1.753

110924 N47938 yy84a09.s1 Soares_multiple_sclerosis_2NbHMSP Homo sapiens cDNA clone 1.751

133002 AF006082 Hs.62461 ARP2 (actin-related protein 2; yeast) homolog 1.751

132530 AA455917 Hjl50785 SEC22; vesicle trafficking protein (S. cerevisiae)-like 1 1.75

110759 N21671 Hs.19025 ESTs 1.75 

106138 AA424515 Hs.33264 ESTs 1.75 

107348 U43701 Hs.184776 ribosomal protein L23a 1.75 

115867 AA432162 Hs.165986 DKFZP586B2022 protein 1.749 

135398 AA194075 Hsj99908 nuclear receptor coactivator 4 1.747 

113783 W18222 Hs.7041 ESTs; Weakly similar to !! ALU SUBFAMILY SQ WARNING ENTRY !! [H.sapiens] 1.747

134898 X98330 Hs*0821 ryanodine receptor 2 (cardiac) 1.745

132215 T10132 Hs.4236 KIAA0478 gene product 1.744

104229 AB002346 Hs.61289 synaptojanin 2 1.743

116166 AA461556 Hs.202949 KIAA1102 protein 1.743

115433 AA284252 Hs.58372 ESTs 1.743

114908 AA236545 Hs.54973 ESTs 1.742

127425 AA470941 Hs.143162 ESTs 1.741

131089 Z38807 Hs.22870 ESTs 1.739

113498 T88908 Hs.189746 ESTs 1.738

116710 F10577 Hs.70312 ESTs 1.735

127210 R51476 yg76f04j1 Soares infant brain 1NIB Homo sapiens cDNA clone 1.733

120554 AA279654 Hs.194524 ESTs 1.733

129940 U18242 Hs.13572 calcium modulating ligand 1.732

117023 H88157 HsU1105 ESTs 1.731

111700 R22212 Hs.23361 ESTs 1.731

116911 H72240 Hs^9292 ESTs; Moderately similar to KIAA0745 protein [H.sapiens] 1.731

106025 AA412063 Hsj6065 ESTs 1.728 

108626 AA101984 Hs.61697 G-protein coupled receptor 1.726 

111614 R12581 Hs.191146 ESTs 1.726 

134134 L76703 Hs.173328 protein phosphatase 2; regulatory subunit B (B56); epsilon isoform 1.725

106886 AA489086 Hs.36545 ESTs 1.725 

117998 N52136 Hs.93828 ESTs 1.725 

121204 AA400422 Hs.55896 ESTs 1.725 

121342 AA404995 Hs.192480 ESTs 1.725 

131129 R27296 Hs.23240 ESTs 1.725 

116235 AA479181 Hs.186726 ESTs 1.725 

102423 U44754 Hs.179312 small nuclear RNA activating complex; polypeptide 1; 43kD 1.724

110273 H29050 Hs.24096 ESTs 1.722

108758 AA127395 Hs.222414 ESTs 1.722 

110672 H88477 Hs.191178 ESTs 1.721

120271 AA176404 Hs.111092 ESTs; Weakly similar to ZINC FINGER PROTEIN 136 [H.sapiens] 1.72

100227 D28915 Hs.82316 interferon-induced; hepatitis C-associated microtubular aggregate prot (44kD) 1.719

129232 W69459 Hs.109655 sex comb on midleg (Drosophila)-like 1 1.719

134663 W73367 Hs.8750 ESTs 1.717

104902 AA055475 Hs.104143 clathrin; light polypeptide (Lca) 1.717

120582 AA281290 Hs.125287 ESTs; Weakly similar to BC331191_1 [H.sapiens] 1.717

134891 F03517 Hs*0787 ESTs 1.716

106219 AA428567 Hs26613 Homo sapiens mRNA; cDNA DKFZp586F1323 (from clone DKFZp586F1323) 1.715

116372 AA521311 Hs.13854 ESTs 1.713

107570 AA001870 Hs.237323 N-acetylglucosamine-phosphate mutase; DKFZP434B187 protein 1.713

106198 AA427616 Hs.11803 ESTs 1.712

125136 W31479 Hs.129051 ESTs 1.712

104973 AA085676 Hs£763 KIAA0942 protein 1.712

128710 J04813 Hs.104117 cytochrome P450; subfamily IIIA (niphedipine oxidase); polypeptide 5 1.711

123994 D20899 Hs.107127 Homo sapiens mRNA; cDNA DKFZp564G022 (from clone DKFZp564G022) 1.711

127671 AA766511 Hs.128848 ESTs 1.71

116089 AA455933 Hs.41324 ESTs 1.709

123337 AA504153 Hs.132797 ESTs; Weakly similar to ORF YGL050W [S.cerevisiae] 1.708

123619 AA609200 Hs.162686 ESTs 1.708

104781 AA026617 Hs.21610 ESTs; Highly similar to BAI1-associated protein 1 [H.sapiens] 1.707

115114 AA256468 HSJ8148 ESTs 1.705

117852 N49408 Hs.138102 KIAA0853 protein 1.705

127644 T57570 Hs.77039 ribosomal protein S3A 1.704

111359 N91273 Hs.27179 ESTs 1.702

131721 L36644 HS31092 EphA5 1.7

132438 F08925 Hs.48810 ESTs 1.7
132476 N67192 Hs.49476 Homo sapiens clone TUA8 Cri-du-chat region mRNA 1.7
130990 F02488 Hs.21917 KIAA0768 protein 1.7
128499 AA487503 Hs.100636 ESTs 1.698 
120780 AA342337 Hs241569 ESTs; Moderately similar to !! ALU SUBFAMILY SQ WARNING ENTRY !! [H.sapiens] 1.697
132920 L06133 Hs.606 ATPase; Cu++ transporting; alpha polypeptide (Menkes syndrome) 1.696
135037 U77948 Hs.184122 general transcription factor II; i 1.696
110024 H11297 Hs.31050 ESTs 1.695
134415 AA329274 H&S2911 protein tyrosine phosphatase type IVA; member 2 1.694
102223 U24685 Hs.148226 Human anti-B cell autoantibody IgM heavy chain variable V-D-J region (VH4) gene; clone E11; VH4-63 nonproductive rearrangement 1.694

126712 AA205862 Hs.7942 ESTs 1.694

101507 M27492 Hs£2112 interleukin 1 receptor; type I 1.692

106291 AA435551 Hs.30824 ESTs 1.691

116826 H58691 Hs.8215 ESTs; Weakly similar to double-stranded RNA-binding nuclear protein DRSBP76 [H.sapiens] 1.69

135339 D59269 Hs.127842 Homo sapiens mRNA full length insert cDNA clone EUROIMAGE 783648 1.69
118250 N62602 yz75b6^1 Soares_multiple_sclerosis_2NbHMSP Homo sapiens cDNA clone IMAGE:288851 3' similar to contains Alu repetitive element; mRNA sequence 1.689

106470 AA450116 Hs.188180 ESTs 1.688 

108203 AA057678 Hs.63408 ESTs 1.687 

119748 W70313 Hs.126906 ESTs 1.686 

116576 D51228 Hs.79404 neuron-specific protein 1.683

123035 AA481392 Hs.105166 ESTs 1.683

126668 AA011616 Hs.184086 ESTs 1.681

101512 M28209 Hs250716 RAB1; member RAS oncogene family 1.678

102704 U76638 Hs54089 BRCA1 associated RING domain 1 1.677

126218 AA256386 Hs.13649 Novel human gene mapping to chromosome 13; similar to rat RhoGAP 1.676

111180 N87277 Hs5403 ESTs 1.676

105937 AA404342 Hs.173531 ESTs 1.675

114118 Z38520 Hs.175930 ESTs 1.675

109203 AA190634 Hs.108787 endoplasmic reticulum membrane protein 1.675

125245 W86608 Hs.7243 ubiquitin specific protease 24 1.675

102906 X06956 Hs.75318 tubulin; alpha 1 (testis specific) 1.675

125914 AA262925 Hs.180034 cleavage stimulation factor; 3' pre-RNA; subunit 3; 77kD 1.674

134294 U63289 Hs*1248 CUG triplet repeat RNA-binding protein 1 1.674

109742 F10108 Hs.183333 ESTs 1.673

134674 D63876 HsJ7726 KIAA0154 protein 1.673

104079 AA402937 Hs.103238 ESTs 1.671

107554 AA001386 Hs.59844 ESTs 1.671

132439 AA243139 Hs.4863 Homo sapiens clone 25088 mRNA sequence 1.669
124515 N58172 Hs.109370 ESTs 1.668
124300 H92575 Hs.105959 ESTs; Weakly similar to !! ALU SUBFAMILY SQ WARNING ENTRY !! [H.sapiens] 1.668
126809 AA743475 Hs.171693 ESTs 1.667
106095 AA419547 Hs.11713 ESTs 1.664

101754 M77142 Hs.239489 TIA1 cytotoxic granule-associated RNA-binding protein 1.663

105188 AA192306 Hs.23926 ESTs 1.663

113582 T91371 Hs.16824 EST 1.661
119559 W38197 Accession not listed in Genbank 1.661

119961 W87535 Hs59015 ring finger protein 9 1.657

123255 AA490890 Hs.105273 ESTs 1.657

111076 N59230 Hs.186574 ESTs 1.655

113082 T40528 Hs.8246 ESTs 1.654

119589 W44692 Hs.124177 ESTs 1.652

104308 D53639 Hs.77904 ribosomal protein S26 1.65

103073 X59417 Hs.74077 proteasome (prosome; macropain) subunit; alpha type; 6 1.65

124424 N35314 Hs.107265 ESTs 1.65

128890 AA096157 Hs.182364 ESTs; Weakly similar to 25 kDa trypsin inhibitor [H.sapiens] 1.65
119400 T92767 ye27d06.s1 Stratagene lung (#937210) Homo sapiens cDNA clone IMAGE:118955 3', mRNA sequence. 1.65

131631 AA486868 Hs.29802 slit (Drosophila) homolog 2 1.65

118229 N62339 Hs.180532 heat shock 90kD protein 1; alpha 1.649

118533 N67954 Hs.49413 ESTs 1.648

130666 AA476307 Hs.194035 KIAA0737 gene product 1.647

103093 X60708 Hs.44926 dipeptidylpeptidase IV (CD26; adenosine deaminase complexing protein 2) 1.647

128667 U69140 Hs.103419 fasciculation and elongation protein zeta 2 (zygin II) 1.646

112933 T15530 Hs221439 ESTs 1.646

114546 AA056263 Hs.132747 ESTs 1.645

126705 AA579377 Hs.180532 heat shock 90kD protein 1; alpha 1.644

114399 AA007595 Hs.220937 ESTs 1.642

116836 W79820 Hs.50854 ESTs 1.64

100401 D85423 Homo sapiens mRNA for Cdc5, partial cds 1.64

1CS681 AA284865 Hs.171228 KIAA1040 protein 1.639

132526 AA460128 Hs.5074 similar to S. pombe dim1+ 1.639

133809 AA034002 Hs.76359 catalase 1.639 

115968 AA447083 Hs.134522 ESTs 1.637 

116370 AA521256 Hs.236204 ESTs; Moderately similar to NUCLEAR PORE COMPLEX PROTEIN NUP107 [R.norvegicus] 1.631

109644 F04477 Hs.204802 ESTs; Moderately similar to GLYCERALDEHYDE 3-PHOSPHATE DEHYDROGENASE; LIVER [H.sapiens] 1.627

103427 X97303 H.sapiens mRNA for Ptg-12 protein 1.627

132186 T33888 Hs.221040 KIAA1038 protein 1.626

131428 U17838 Hs.26719 PR domain containing 2; with ZNF domain 1.626

126638 AA649257 Hs.188602 ESTs 1.625

114503 AA039568 Hs.188083 ESTs 1.625

121242 AA400857 Hs.97509 EST 1.625

122414 AA446885 Hs.99087 ESTs; Moderately similar to ZINC FINGER PROTEIN 141 [H.sapiens] 1.625

110632 H72344 Hs.171635 ESTs 1.624

111389 N95837 Hs.169111 ESTs; Weakly similar to U2A [D.melanogaster] 1.624

112449 R63802 Hs.124186 ring finger protein 2 1.623

113070 T33464 Hs.6298 ESTs 1.622

107229 D59284 Ha34644 ESTs 1.618

132710 W93726 Hs£5279 protease inhibitor 5 (maspin) 1.617

124664 N94814 Hs33540 ESTs; Weakly similar to KIAA0765 protein [H.sapiens] 1.617

130166 AA350690 Hs.151411 KIAA0916 protein 1.616

125040 T78451 Hs.199961 ESTs 1.615

132972 H39627 Hs.164967 ESTs; Weakly similar to !! ALU SUBFAMILY SB WARNING ENTRY !! [H.sapiens] 1.615

115873 AA433916 Hs.90093 heat shock 70kD protein 4 1.611

120408 AA235045 Hs.190151 ESTs 1.61

120934 AA383773 Hs.191500 ESTs 1.61

115259 AA279071 Hs.13453 splicing factor 3b; subunit 1; 155kD 1.609

134330 D20113 Hs.8185 ESTs; Highly similar to CGI-44 protein [H.sapiens] 1.607

115117 AA256492 Hs.49007 poly(A) polymerase 1.606

125162 W44682 Hs.109896 ESTs 1.605

103946 AA285246 Hs.111650 ESTs; Weakly similar to Prt1 homolog [H.sapiens] 1.604

133389 AA166917 Hs.72639 ESTs 1.603

115528 AA342301 Hs.53929 ESTs; Weakly similar to !! ALU CLASS B WARNING ENTRY !! [H.sapiens] 1.602

129704 W81301 Hs.12064 ubiquitin specific protease 22 1.602

109313 AA206800 Hs36276 ESTs; Moderately similar to zinc finger protein dp [H.sapiens] 1.601

130457 U58091 Hs.155976 cullin 4B 1.6

123076 AA485211 Hs.190046 ESTs 1.6

115113 AA256460 Hs.44810 ESTs 1.6

117731 N46433 Hs.46609 ESTs 1.6
131798
125370 
114918
114807 
105103 
125004 



AA504338 

X88098 

AA256743 

AA236813 

AA160805 

AA151593 

T60120 






105656 AA282914 
110455 H52172 

119780 W72967 
126983 AA211537 



134675 
105431 
120187 
115830 
135069 
122997 
119707 
131934 
106141 
115271 
131468 
131165 
117273 
101569 
116127 
12)022 
117512 
106511 
116415 
127879 
125211 
114746 
122688 
116765 
130895 
114338 
111005 
128135 
112046 
132160 
111568 
127775 
115359 
121845 
127854 

120287 
114940 
126716 
134161 
125390 
115334 
113721 
114895 
119341 
108012 
130335 
134351 
133300 



118744 



115913 
107868 
134520 



AA250745 

AA252033 

Z40251 

AA428137 

AA456311

AA479295 

W67569 

D80948 

AA424558 

AA279422 

R27598 

R98173 

N21680 

M33772 

AA459703 

W90625 

N32157 

AA452865 

AA609204 

AA810215

W72798 

AA135638 

AA456112 

H12636 

AA609826 

Z41366 

N53076 

AA913491 

R43365 

AA281770 

R10153 

H04106

AA281936 

AA425734 

AA769520 

AA167679 

AA243012 

AA031700 

U97188 

K95094 

AA281244 

T97831 

AA236177 

T62571 

AA039616 

AA156499 

R82074 

D51401 

AA490899 

N74075 

W20016 

AA436720 

AA025234 

N21407 



Hs.171857 

Hs.3238

Hs.151791

Hs.72324 

Hs.199832 

Hs.10130 



Hs.10176



Hs.191381 



Hs.87773 

Hs.15036 

Hs56974 

Hs.86434

Hs.93961 

Hs.106290 

Hs.44143 

Hs.34922 

Hs.9302

Hs.5724

HS27197 

Hs.23763 

Hs.43047 

Hs.182421 

Hs.79070 

Hs.58432

Hs.82207

Hs.206713 

Hs27973 

Hs.189079 

Hs.103177 

HS223756 

Hs.99410 

Hs.121585 

HS21015 

Hs.40109 



Hs.189143 

Hs.22273

Hs.184081 

HS20561 

Hs.179902 

Ha*8914 

Hs.165066 



Hs.111114 

Hs.75928 

Hs£51962 

Hs.79440 

Hs.75187 

Hs.65300

Hs.18190 

Hs.76591 

Hs.146388 

Hs.61933 

Hs.8454 

Hs.82109 

Hs.70333 

Hs.24462 

Hs.94293 

Hs.144228 

Hs55487 

Hs51260 

Hsi57325 



adenovirus 5 E1 A binding protein 

KIAA0092gene product 

ESTs; Highly similar to unknown [Rsapiens] 

ESTs 

ESTs 

yb68f02.s1 Stratagene ovary (#937217) Homo sapiens cONA clone 

IMAGE:76347 3'. mRNA sequence. 

ESTs 

yt85e8.s1 Soares_pineaLgIand_N3HPG Homo sapiens cONA done 
IMAGE231 11 3 1 similar to contains Alu repetitive element mRNA sequence 
ESTs; Weakly similar to hypothetical protein [Rsapiens] 
zn55d01 .rl Stratagene muscle 937209 Homo sapiens cDNA clone 
IMAGES62081 S, mRNA sequence, 
protein kinase; cAMP-dependent; catalytic; beta 

ESTs; Weakly similar to !! ALU SUBFAMILY J WARNING ENTRY !! [H.sapiens]

ESTs 

ESTs 

ESTs; Weakly similar to !! ALU CLASS A WARNING ENTRY !! [H.sapiens]

Kelch motif containing protein 

ESTs; Weakly similar to SNF2alpha protein [H.sapiens]

ESTs 

phosducin-like
ESTs 

KIAA0797 protein
Max-interacting protein 
ESTs 

troponin C2; fast 

v-myc avian myelocytomatosis viral oncogene homolog

ESTs 

ESTs 

UDP-Gal:betaGlcNAc beta 1,4-galactosyltransferase; polypeptide 2

KIAA0874 protein 

ESTs 

ESTs; Wkly smlr to cDNA EST EMBL:D32579 comes from this gene [C.elegans]

ESTs 

ESTs 

ESTs; Weakly similar to reverse transcriptase [H.sapiens]

ESTs; Highly similar to tetracycline transporter-like protein [M.musculus]

KIAA0872 protein 

ESTs 

ESTs; Modrtly smlr to !! ALU SUBFAMILY J WARNING ENTRY !! [H.sapiens]
ESTs 

seven in absentia (Drosophila) homolog 1
ESTs 

ESTs; Weakly similar to NG22 [H.sapiens]
ESTs 

ESTs; Weakly similar to hypothetical protein [H.sapiens]

ESTs; Weakly similar to REGULATOR OF MITOTIC SPINDLE ASSEMBLY 1 [H.sapiens]

ESTs 

ESTs 

ESTs 

IGF-II mRNA-binding protein 3

translocase of outer mitochondrial membrane 20 (yeast) homolog

ESTs 

EST 

KIAA0887 protein
microtubule-associated protein 7
ESTs 

protein kinase; cAMP-dependent; regulatory; type II; alpha 
1 



ESTs 
ESTs 
EST 

ESTs; Weakly similar to ZINC FINGER PROTEIN 83 [H.sapiens]

ESTs 

ESTs 

ESTs 



1.599
1.597
1.596
1.596
1.596
1.594
1.592
1.589

1.587
1.586
1.584
1.584
1.584
1.581
1.581
1.581
1.58
1.58
1.58
1.579
1.577
1.575
1.575
1.575
1.575
1.575
1.574
1.573
1.573
1.571
1.571
1.571
1.57
1.568
1.568
1.567
1.567
1.567
1.566
1.566
1.566
1.566
1.566

1.564
1.563
1.562
1.562
1.561
1.561
1.559
1.558
1.558
1.558
1.558
1.557
1.557
1.553
1.553
1.552
1.55
1.55
1.55
1.55
109703 F09684 Hs^4792 ESTs; Weakly similar to ORF YOR283w [S.cerevisiae] 1.55
120288 AA187938 Hs.55189 ESTs; Weakly similar to F25B5.3 [C.elegans] 1.548
106356 AA443277 Hs.31034 peroxisomal biogenesis factor 11A 1.548
129460 AA235627 Hs.11171 APG5 (autophagy 5; S.cerevisiae)-like 1.547
133950 D11961 Hs.77823 ESTs 1.546
128172 AI400862 Hs.142607 ESTs 1.546
114162 Z38909 Hs.22265 ESTs 1.545
101803 M86546 Hs.155691 pre-B-cell leukemia transcription factor 1 1.544
113617 T93630 Hs.17207 ESTs 1.542
104896 AA054228 Hs.23165 ESTs 1.541
114477 AA032013 Hs.144260 EST 1.54
110731 H98653 Hs.188006 KIAA0878 protein 1.54
130367 Z38501 Hs5768 ESTs; Wkly smlr to !! ALU SUBFAMILY SQ WARNING ENTRY !! [H.sapiens] 1.538
130539 L07044 Hs£50857 Homo sapiens calcium/calmodulin-dependent protein kinase II mRNA; partial cds 1.538
134921 W60186 Hs.169487 Kreisler (mouse) maf-related leucine zipper homolog 1.537
130583 W24957 Hs.16281 ESTs; Moderately similar to similar to C.elegans protein encoded in cosmid T20D3 [H.sapiens] 1.537
133723 AA088851 Hs.75744 S-adenosylmethionine decarboxylase 1 1.537
106450 AA449469 Hs.11859 ESTs 1.536
104120 AA429838 HsJ9519 KIAA1046 protein 1.536
100533 HG1879-HT1919 Ras-Like Protein Tc10 1.535
130664 R09049 Hs.17625 ESTs 1.535
127122 AA279153 Hs.190049 ESTs 1.535
134264 T03391 Hs.8087 ESTs 1.535
132319 AA418662 Hs.44625 ESTs 1.535
115465 AA286941 Hs.43691 ESTs 1.533
125003 T59442 Hs.100445 ESTs 1.532
102273 U30388 Hs.75981 ubiquitin specific protease 14 (tRNA-guanine transglycosylase) 1.532
121875 AA426299 H&98510 ESTs 1.532
114366 Z41747 Hs.469 succinate dehydrogenase complex; subunit A; flavoprotein (Fp) 1.531
132944 AA054515 Hs.6127 ESTs; Weakly similar to prostate-specific transglutaminase [H.sapiens] 1.53
111199 N68210 HS29822 ESTs 1.53
113494 T88878 Ha258738 ESTs 1.529
129515 AA490882 Hs.112227 ESTs 1.528
133124 AA156049 Hs.65490 ESTs 1.528
104785 AA027163 Hs.7942 ESTs 1.526
105595 AA279408 Hs35866 ESTs 1.526
130198 U67156 Hs.151988 mitogen-activated protein kinase kinase kinase 5 1.526
114297 Z40758 Hs.173091 DKFZP434K151 protein 1.525
112876 T03488 H&4842 ESTs 1.525
127500 AA525014 Hs.162115 ESTs 1.525
120519 AA258585 Hs.129887 cadherin 19 (NOTE: redefinition of symbol) 1.525
119859 W80702 Hs.58461 ESTs 1.525
129944 L00389 Hs.1361 cytochrome P450; subfamily I (aromatic compound-inducible); polypeptide 2 1.524
118864 N89870 HjU2148 ESTs; Weakly similar to Su(P) [D.melanogaster] 1.523
123964 C13961 H&210115 EST 1.523
111676 R19414 H&166459 ESTs 1.522
128332 AI079523 H&134173 ESTs 1.522
130455 X17059 Hs.155956 N-acetyltransferase 1 (arylamine N-acetyltransferase) 1.521
125181 W58461 Hs.12396 ESTs 1.521
127093 AA768241 oa72d02.s1 NCI_CGAP_GCB1 Homo sapiens cDNA clone IMAGE:1317795 3', mRNA sequence. 1.521
132156 AA157401 Hs.4113 S-adenosylhomocysteine hydrolase-like 1 1.521
125303 Z39821 Hs.107295 ESTs 1.52
132697 AA281951 Hs5518 Homo sapiens mRNA; cDNA DKFZp566J2146 (from clone DKFZp566J2146) 1.52
117086 H93135 Hs.41840 ESTs 1.519
113355 T79203 Hs.14480 ESTs 1.518
108621 AA101811 Hs.69506 ESTs 1.516
109384 AA219172 H&86849 EST 1.518
128510 X94703 Hs.100816 RAB28; member RAS oncogene family 1.517
132868 N77151 Hs.61638 myosin X 1.515
117035 H88798 Hs.41182 ESTs 1.515
116781 H22985 Hs52132 ESTs 1.513
108677 AA115629 Hs.118531 ESTs 1.513
130214 H78003 Hs.15266 ESTs 1.513
134700 AA481414 Hs.8868 golgi SNAP receptor complex member 1 1.512
116618 D80783 Hs.45224 ESTs 1.508
126257 N99638 tumor necrosis factor receptor superfamily; member 10b 1.508
125859 AA806808 Hs.118797 ubiquitin-conjugating enzyme E2D 3 (homologous to yeast UBC4/5) 1.508
113837 W57698 Hs.8888 ESTs 1.507
114317 Z41038 Hs.469 succinate dehydrogenase complex; subunit A; flavoprotein (Fp) 1.507
100311 D50640 Hs.184653 phosphodiesterase 3B; cGMP-inhibited 1.507
126802 AA947601 HsJ7056 ESTs 1.506
128661 R82837 Hs.103329 KIAA0970 protein 1.506
134194 AA233231 H&79828 ESTs 1.506
108953 AA149652 Hs.42128 ESTs 1.504
133240 D31161 Hs.68613 ESTs 1.502
132671 X76302 Hs54649 putative nucleic acid binding protein RY-1 1.501
132609 Z48923 Hs53250 bone morphogenetic protein receptor; type II (serine/threonine kinase) 1.501
105574 AA278678 Hs.258567 ESTs 1.5
113718 T97782 Hs.256268 ESTs 1.5
127824 AI208365 Hs.127811 ESTs 1.5
130132 U55936 Hs.184376 synaptosomal-associated protein; 23kD 1.5
127394 AA453224 ESTs; Weakly similar to !! ALU SUBFAMILY J WARNING ENTRY !! [H.sapiens] 1.5
100485 HG1111-HT1111 Ras-Like Protein Tc21 1.5
101078 L04510 Hs.792 ADP-ribosylation factor domain protein 1; 64kD 1.5
128611 AA456845 Hs.102471 KIAA0680 gene product 1.5
TABLE 12A shows the accession numbers for those Pkeys lacking UniGene IDs in
Table 12. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from
Genbank ESTs and mRNAs. These sequences were clustered based on sequence similarity
using Clustering and Alignment Tools (DoubleTwist, Oakland, California). The Genbank
accession numbers for the sequences comprising each cluster are listed in the "Accession"
column.
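
The clustering step described above can be pictured with a minimal sketch. This is not the DoubleTwist Clustering and Alignment Tools method, which the text does not describe. It assumes a hypothetical similarity test (the fraction of shared 8-mers) and groups sequences by single linkage; the function names, the threshold, and the toy sequences are all illustrative assumptions.

```python
from itertools import combinations


def kmers(seq, k=8):
    """Return the set of all k-length substrings of a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def similar(a, b, k=8, min_shared=0.5):
    """Hypothetical similarity test: shared k-mers relative to the smaller k-mer set."""
    ka, kb = kmers(a, k), kmers(b, k)
    if not ka or not kb:
        return False
    return len(ka & kb) / min(len(ka), len(kb)) >= min_shared


def cluster(sequences, k=8, min_shared=0.5):
    """Single-linkage grouping of {accession: sequence} using union-find."""
    parent = {acc: acc for acc in sequences}

    def find(x):
        # Follow parent links to the cluster representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(sequences, 2):
        if similar(sequences[a], sequences[b], k, min_shared):
            parent[find(a)] = find(b)

    groups = {}
    for acc in sequences:
        groups.setdefault(find(acc), []).append(acc)
    return sorted(sorted(members) for members in groups.values())


# Toy sequences (invented): EST_A and EST_B overlap, EST_C is unrelated.
toy = {
    "EST_A": "ACGTACGTTAGCCGATAGGCTA",
    "EST_B": "CGTTAGCCGATAGGCTAAACCG",
    "EST_C": "TTTTTGGGGGCCCCCAAAAATT",
}
clusters = cluster(toy)
```

Each resulting group plays the role of one gene cluster, and the accession identifiers in a group correspond to the entries of one "Accession" cell.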



Pkey: Unique Eos probeset identifier number

CAT number: Gene cluster number

Accession: Genbank accession numbers
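
As an illustration of the three-column layout defined above, the following sketch splits a row into its Pkey, CAT number, and accession list, then builds a reverse lookup from accession to Pkey. The function names are hypothetical, and the sketch assumes each row sits on a single whitespace-delimited line, which does not hold for the wrapped rows of the printed table.

```python
def parse_cluster_row(line):
    """Split one row into Pkey, CAT number, and the list of accessions."""
    fields = line.split()
    if len(fields) < 3:
        raise ValueError(f"expected Pkey, CAT number and an accession: {line!r}")
    return {"pkey": fields[0], "cat": fields[1], "accessions": fields[2:]}


def accession_to_pkeys(rows):
    """Map each Genbank accession to the set of Pkeys whose cluster contains it."""
    index = {}
    for row in rows:
        for acc in row["accessions"]:
            index.setdefault(acc, set()).add(row["pkey"])
    return index


# Two short rows taken from Table 12A.
rows = [
    parse_cluster_row("124704 292319.1 R07335 R07640"),
    parse_cluster_row("124825 330773.1 AA501669 R52088"),
]
index = accession_to_pkeys(rows)
```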



Pkey CAT number Accession 



1(38535 119811.1 AA084524 AA339253 AW966289 

117040 46956 J AW97D600 AA5G3323 H89218 ATO86Q31 H891 12

100782 18457 J AA355435 NM_001516 Z30093 T28405 AW949486 AA461142 AA4 10532 AI652073 AA521208 AI970141 AI968234 Al 026 102 

AA713583 AW135876 AA936614 AA770300 AI242635 AA377033 AW96Q263 AW607683 AI273603 AA410287 A1040513 
AA460838 AJ80391 6 AW294095 AW448680 AW798577 AW675048 BE542116 AL120521 

100819 3022L1 L34840 NM_003241 U31905 A1546931 AI791616 AI973065 AI792321 AI546937 AI685880 AI732B35 AI682360 AA420653 

AA564047 A1682323 A1824614 AI659889 A1680052 AI970887 AI623108 AA420692 AI418074 AA831018 AI810595 AW291463 
AW449930AI668908AI970818 

100824 5.36 AI393237 A1521317 AI761 348 AF025841 D43968 AW994987 L34598 AFQ25841 D89789 D89788 D89790 AW998932 

AI971742 A131Q238 X80976 AW139668 AW674280 AI365552 AA877452 AV657554 C75229 AA376077 AI798056 AW609213 
W25586 H30149 BE075089 BB075190 AW580858 H99598 AA425238 AA133916 AW363478 BE158121 BE158127 
AW467960 BE158135 BE158126 BE158145 N92860 AA847246 A/961688 AI361423 AA878154 AA043767 AJ863712 
AI559226 AW339007 AJ371266 AI368901 AA046624 AA1 34739 AW449154 AA130232 AI458720 AA95251 1 AI700627 
R70437 AW0O4OO8 AA045229 A1671572 H99599 AAD43768 AI685454 AI871685 N29937 X90977 AA524240 All 421 14 
A1825750 AI567805 AI631365 AI347B93 AA134740 F20669 AA046707 AW793216 AW963298 AW959380 AA363265 
AJ7B4593 AI268201 R59451 AV657618 Al 695588 

125004 264197J BE312163 AJ230798 AA374482 AI926059 AA622653 AI860704 BE139185 AW296884 T60238 T60120 

102313 27608J U33921 AI190489 AA573311 

102337 553.1 Al 81 4663 AA806761 AA765241 AA019317 AA092255 AA035405 TB5079 AA8901S1 AJ373959 T85080 BE153728 AA740848 

BE0806B2 AL0481 37 AW18231 6 AI699468 AW274481 AW407538 AA306562 AW950024 AW949943 AL045703 AWB43196 
W25132 BE612794 AA304266 AW958054 H25673 AV646563 AV646573 BE172990 AW593488 AA3851 81 AA164998 
AI246476 AA345406 AJ277554 AA134749 AA856624 BE613247 AA299003 AL048 138 AA028121 T92510AI923835 
AW020440A1401594 AI889401 N93290 AA044247 AAQ28100 AI582845 AA81 1151 AI741811 AI925878 AA448277 AA172221 
AI214783 BE220793 AA022746 AI082882 AA022849 AI928385 AA573472 AI420686 AW0728Q2 AI799493 AI873506 
AJ468977 AJ1 92079 AJ468976AA044272 AW015701 AW316979 AA933042 AA609017 A131B393 AJ424571 A1934945 
AA172023 AW050917 AA846180 AA1 34748 AI003947 AI766769 AW006697 AA653517 AW575680 AI474214 AA401 478 
U36922 AA927064 AA868000 D62654 T91745 AW500202 AA194764 AA746346 AA130464 AW1 17498 AA054526 N26432 
HQ2534 H04964 AW303367 BE300931 AI21 8049 A1208073 AW182749 AA983630 AI147585 AA194765 AA054534 AA922720 
AI436585 AI346535 AA134269 AA280923 AA897422 AA019559 AW274010 AA035406 AA917879 H99327 W32908 AI216046 
AW496823 AA019414 H82288 VV35284 AJ93SS21 AI7671 13 AA866177 AW367874 H82398 AF032885 AW300151 AW467069 
AA809346 Al 183507 AI494178 AA872752 Al 63 1631 UQ2310 NWL002015 AAB15006 AI382453 AW197658 AI761654 
A/804398 A/382221 AI81 3640 A/439635 A/523901 AW517242 A1221705 AW2981G4 AW204560 AW573095 AWQ28783 
AW014650 AI766744 AI808294 AI698758 A/041 809 AI766667 AI479103 AA872797 AA769305 AA765080 AA334166 
AI472322 

124704 292319.1 R07335 R07640 

116988 185904J AW953679 AW953680 AA244436 H82527 AA361046 AA244483 H82526 

124825 330773.1 AA501669 R52088

110455 46874.1 H52576 AF085971 H52172

126257 182217 J N99638 AW973750 AA328271 H90994 AA558020 AA234435 N59599 R94815 

125624 154135J AW968363 AA465492 R34539AA1 65411 
104038 264235J AA374532 AA421255 

103427 4389SJ BE514383 AA071273 AW247987 AW673286 BE312102 AW749824 BE071985 AW577383 BE071945 BE072005 AW577355 

BE071965 AW239231 BE072000 BE071960 AW577360 AW749830 AW373020 X97303 AW999522 BE000192 BE562219 
BE266655 BE264970 

104142 113242 J AA074713AA447006 

127093 47721.1 AW977549 AA256038 AL365415 AW500455 AA768241 AW968097Z17849 AA256104 
125873 10492.1
125954 4457J 



125992 1589048J
127210 15307.6



127263 232161.1
135197 29440.1



127394 304844.1 
126879 1860J 
126983 171841J 
120470 188975.1 
127854 443883.1 
121367 280429.1 
106320 6435.1 
115479 201515 1
101026 11075.1 

100401 24827.1 



130542 28089.3 



100485 30576.2 

108345 112277 6 
100522 19669.1 

100533 32905.1 

100598 23902.2 



102332 14745.3 

118250 genbanK_N626Q2 

103678 entra*_ZB4483 

119400 genbankJ92767 

119559 entn*_W38197 



AW271838 AL133605 C01646 H29959 AA999896 D60676 AW999454 AW961 176 AA315244 H14437 AW3851 18 N46512 

AW272Q21 AI768516 BE466421 AI082809 A! 804454 AA905101 AW173368 N38942 AW6141 69 AI080463 N29489 AI500550 

AA994475 AA614464 AA707368 AA593145 AA569473 AW62781 5 AI828244 N63226 N42300 

NM.016353 AB023584 W44753 R09585 AA382865 R23772 AI814257 AA974046 AK001608 AI935638 AW440609 AI420022 

AA777386 AA806969 AI554876 A1584006 AI668556 AI688634 AI697997 AI014540 AI806683 AI741202 AW263154 

AW297238 AI149951 AI589076 AW082158 AW614265 AA931887 AA781969 RO9490 AA484643 AI207121 AI088390 

A1538065 AI619547 A1741 925 AT7Q2846 H40846 R93943 AW747979 AA461348 U30163 AA326023 AB35992 AW242870 

AI244025 AI222558 W38425 AW473630 AI624599 A1921226 A1683152 A1096458 AJ 123822 AW170802 C16447 AJ337674 

D25726 AW339368 AW771259 AA461 174 

H48372W01626 

AA30527BAA223833 

110924 6443.1 AW058463 AF1 95766 AA6801 45 T86901 W60373 W6Q281 NMJW7222 AF106862 AKW0795 AA167188 
AWB84503 AW891313 AWB91332 AW891312 AI984924 AI123518 N75170 AA131614 H25330 AI913358 AI742277 W25576 
R58771 AW445159 AW888628 AW888627 AW274674 AI088482 N52314 N34282 AW001769 AI338943 TB5784 AI288963 
AW468676 AW237528 H25289 N7169Q AA610126 Al 143458 A1082599 N49144 AA854773 AW663411 AW610151 N47938 
AW601626 AA167189 AA918304 AA805205 BE069496 AA652836 BE069499 AI699298 AW249926 AW888578 BE567635 
T10726 AW604715 D54245 D53062 D55610 D55555 AA301378 AI133498 N77788 A1936320 AW090734 A1269977 N50828 
AA550814 AI421993 AI005384 N50813 D6Q292 D59349 AA131710D81698 D81699 
AA331 156 AA331 157 AA331 155 

U76456 NM.003256 AF057532 AA193414 AW293304 AW963376 AA31 3095 AI359841 AI969312 AI080163 AW448926 

AI671 136 BE466399 AI637957 A!671673 AW196583 AW071 635 AJ634427 AW296872 AW292470 AA193650 

BE161832 AA453224 AA485772 

D90391 M55575 AI652268 AA719776 

AA524886 AW971347 AA211537 

AW971327 AA524988 AW628653 AA251797 

AW976798AA769520 

AA432071 AA405646 AW000908 T16347 

AB028957 AL120001 AI267678 H10928 R19844 AW970334 AA393182 F05472 F1 171 1 H09908 N5Q250 AI81541 1 BE463679 
D61468AW970253 D60889C15548D61011 D60867AI815795AA534831 D81386 AW235039 A3382158 D81 174 AA41 6899 
AA852310 H09789 H10929 H09813 F09369 R44721 D51515 Z38456 R14004 T66255 F12148 F12139 AW351702 M85350 
A1018713 AW972450 AW972645 AA514964 T66172 F09785 F09776 AA436608T05327 T07118 AA339352 
AW301 608 N46706 AA649093 AA287595 AW81 1753 AA287596 N39260 

NM.O01874 J04970 791426 AW205201 784979 AA255727 AA847837 R02164T91339 AV651884 AV651835 AV651350 
AV6501 18 AV651338 A12720Q2 AJ367796 AA830651 AA2621 12 AW151 198 

AU076696 AA21 9720 AL135197 AA305877 N56376 AA318063 AA130725 AW954903 BE541230 AW383312 U86753 D85423 
A1679458 AI1 22932 AB007892 AI583919 BE160134 F08104 R34903 F1 3440 AA095444 AA262453 AA1 91036 R17895 
T81266 BE149776 A1279537 A11431 13 AA361072 AW959030 AW26B817 AA81 1533 BE275179 AI221677 T65147 R49293 
AA249176 BE000290 AA768053 F09494 BE092645 BE172099 Z41 177 AA044750 AI909768 BE140795 BE140574 AWB45210 
AW752452 BE243244 AA843664 AI300080 BE169032 AW189979 BE004869 AA621872 AI951772 AI678897 A1926598 
N62613 AB50912 AW608791 AI309602 AIS83138 AW875592 A1655073 AW875626 AA130606 AI370827 C75528 C75554 
AW263335 AI344426 BE004788 AA576220 AA604824 A1431405 AA749378 R38882 AW955075 AA1TO821 C75657 
AA219672 AW768408 R43141 AI431414 AA483343 AI673792 T17294 AW7701 87 N74285 AI476404 AI088268 AA654152 
AW974864 BE617311 BE243328 BE168049 

U64675 AW167507 AW 167503 BE218568 AA779360 W85722 AL044843 BE159404 AF012066 AW89861 1 AW898610 
BE 159405 BB092191 AW890826 AW369841 AW368064 AW606702 AL044731 R82691 AA419346 AA416558 H 9 6045 
AL040450AI640531 AI803434 AUM6613 AW855784 AW362469 AL048881 AL049015 AA094272AA888908AA417294 
AW237786 R59793 AL044916 D82402 A1216854 AI079342 H96406 AL037845 A1915900 AA9721 33 AI478783 T31074 
Z21 135 221396 AA352182 R13918 AA430178 C17811 AI371824 AI742256 M926801 N79156 AA350610 AA081971 N83639 
R35544 AA312292 AW952080 N42322 AA171 957 AA565297 R89207 AA504106 AK3Q782 AA826482 AI301579 T36241 
AW966618 Z28426 AL043480 A1124636 AA393449 T19504 AWB87823 AJ289814 N53979 AL043571 AI632764 A1859613 
A1986308 A1683212 AI984499 AJ133258 C05898 AW512761 AI041260 BE466240 Z19161 AI351190N67549 AI373374 
AA400873 AW440914 AW514879 AA770146 AI358754 R51 1 13 A1283773 AA649886 T30543 D54358 R37750 T03358 
T15451 T15880 AA999689 N67396 AI056289 T85597 N62441 R89099 R00035 T85596 R61335 R00128 N63359 AI535964 
AI207768 M31468 NM.012250 W0 1322 AA253280 AA253233 AA293148 AW5B2106 R79880 AA459547 AA363459 
AA234396 N31669 H44468 AA434587 AW363088 AW993541 
AA070906AA070934 

X51501 NM.002652 Y10179 J03460 AI791618 AI821473 AA916588 AA564296 AA9161 10 AI972286 AM20470 AI568790 
A1597724 AW205207 AI659305 AI791620 AA532383 AI821 475 AA526498 

NM.012249 M31470 AL043108 AA262561 AA178883 729433 AA313329 W48807 AW404323 AA453560 AW403227 K94816 
W17101 M165152 W23989 AA091310 

AL121734 054896 AA424269 BE242906 AA362118 BE018454 AI280348 AL048769 M35543 AA757734 AI128865 H20289 

H23728AI203445 H41481 H18237 H44081 H92839 AI928621 H75675D51148 AI796198AW390453 O55579 D54145D53996 

D54015 R37664 H17541 AA668681 765061 R15867 AW468123 R 16049 H69030 AA054226 H16070 F09655 R92144 703521 

R05473 H92840 AA018186 R91707 

U35637AA1 12989 Z1 9308 

N62602 

Z84483 

T92767 

W38197 
TABLE 13 shows genes, including expressed sequence tags, up-regulated in prostate tumor
tissue compared to normal prostate tissue, as analyzed using the Affymetrix/Eos Hu02 GeneChip
array. Shown are the ratios of "average" normal prostate to "average" prostate cancer tissues.
Pkey: Unique Eos probeset identifier number

ExAccn: Exemplar Accession number, Genbank accession number

UnigeneID: Unigene number

Unigene Title: Unigene gene title

R1: Background subtracted normal prostate : prostate tumor tissue
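
To make the direction of the R1 ratio concrete, here is a minimal sketch under stated assumptions. How the averages and the background were actually computed is not given in the text, so the single `background` value, the function names, and the intensities are hypothetical. The point is only that R1 is normal divided by tumor, so a value below 1 indicates higher expression in tumor tissue.

```python
def r1_ratio(normal_avg, tumor_avg, background=0.0):
    """Background-subtracted normal prostate : prostate tumor ratio."""
    normal = normal_avg - background
    tumor = tumor_avg - background
    if tumor <= 0:
        raise ValueError("tumor signal must exceed background")
    return normal / tumor


def fold_up_in_tumor(r1):
    """Convert an R1 value into fold up-regulation in tumor (the reciprocal)."""
    if r1 <= 0:
        raise ValueError("R1 must be positive")
    return 1.0 / r1


# Invented intensities: normal 25, tumor 405, background 5 gives R1 = 20 / 400.
r1 = r1_ratio(25.0, 405.0, background=5.0)
fold = fold_up_in_tumor(r1)
```

Read this way, an R1 entry of 0.05 corresponds to roughly 20-fold higher signal in tumor than in normal prostate.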




Pkey ExAccn UnigeneID Unigene Title R1


333516 CH22_FGENES.173_1 0.028
337954 CH22_EM:AC005500.GENSCAN.96-3 0.029
332496 R73299 Hs.204354 ras homolog gene family; member B 0.03
337944 CH22_EM:AC005500.GENSCAN39-7 0.033
334111 CH22_FGENES^30_10 0.033
333657 CH22_FGENES^41_2 0.034
327718 CH.04_hsgi|6525284 0.034
336355 CH22_FGENES.817_5 0.035
322011 AL137354 EST cluster (not in UniGene) 0.035
336377 CH22_FGENES.821_5 0.036
300254 AW079607 Hs.188417 ESTs; Weakly similar to ZnT-3 [H.sapiens] 0.037
330096 CH 19 n2oii601527B 0.037
335191 CH22_FGENES.507_6 0.038
334040 CH22_FGENES.322 fi 0.039
333586 CH22_FGENES204J2 0.04
333295 CH22_FGENES.132_2 0.042
3133215 AI088120 H&.12232S ESTs 0.043
329517 CH 10 n2 0(13983513 0.043
333403 CH22_FGENES.144_21 0.043
335226 CH22_FGENES.513_11 0.044
335976 CH22_FGENES.652_11 0.045
333637 CH22 FflFNFR PPfl 2 0.046
334582 CH22_FGENES.407 S 0.046
336437 CH22_FGENES.826_4 0.047
337461 CH22_FGENES.782-1 0.047
302892 N58545 Hs RQ75 hktnno rtoa rah/la 3 0.049
338689 CH22_EM:AC005500.GENSCAN.475-3 0.049
334721 CH22_FGENES.42L32 0.049
305867 AA864572 EST singleton (not in UniGene) with exon hit 0.049
335498 CH22_FGENES.57U 0.05
311596 AI682088 H&223368 ESTs 0.05
326959 CH.21_hsgi|6469838 0.051
311688 AW025661 HS540090 ESTs 0.052
317298 AI922374 Hs.158549 ESTs 0.052
332984 CH22_FGENES54j6 0.052
321039 AW247083 EST cluster (not in UniGene) 0.053
335844 CH22_FGENES.623_4 0.053
325371 CH.12_hsgi|5866920 0.054
335667 CH22_FGENES.590_18 0.054
333635 CH22_FGENES.228_2 0.054
336736 CH22_FGENES.110-2 0.055
335893 CH22_FGENES.635J 0.055
333170 CH22_FGENES.94J 0.055
329768 CH.14_p2gi|6015501 0.055
334030 CH22_FGENES.320_2 0.055
323359 AA234172 Hs.137418 ESTs 0.055
300453 AW051431 Hs.113029 ribosomal protein S25 0.055
334262 CH22_FGENES.367J2 0.055
306590 AI000246 EST singleton (not in UniGene) with exon hit 0.055
331087 R22520 H&23398 ESTs 0.055
338620 CH22_EM:AC005500.GENSCAN.450-18 0.056
339045 CH22_DA59H18.GENSCAN.28-5 0.056
308023 AI452732 EST singleton (not in UniGene) with exon hit 0.057
339067 CH22_DA59H18.GENSCAN.33-3 0.057
335689 CH22_FGENESi86.4 0.057
339069 CH22_DA59H18.GENSCAN.33-5 0.057
338176 CH22_EM:AC005500.GENSCAN21W 0.057
328159 CH.06_hsgi|5868065 0.058
335655 CH22_FGENES.590_6 0.058
336371 CH22_FGENESi20_1 0.058
336558 CH22_FGENES.842_3 0.059
337738 CH22_EM:AC000097.GENSCAN.1004 0.059
334273 CH22_FGENES.369_2 0.059
335889 CH22_FGENES.633_3 0.059
327807 CH.05_hsgi|5867968 0.059
333315 CH22_FGENES.138_7 0.059
338825 CH22_DJ246D7.GENSCAN.4-6 0.06
337612 CH22_C20H12.GENSCAN.22-5 0.06
333897 CH22_FGENES293jl 0.06
335990 CH22_FGENES.655_4 0.06
334264 CH22_FGENES.367_15 0.06
338653 CH22_EM:AC005500.GENSCAN>J60-39 0.061
322303 W07459 EST cluster (not in UniGene) 0.061
333498 CH22_FGENES.168_8 0.061
336522 CH22_FGENES.839_3 0.061
301357 AW295677 Hs.137840 ESTs; Moderately similar to HOMEOBOX PROTEIN SK1 [H.sapiens] 0.062
305917 AA876469 Hs.181357 laminin receptor 1 (67kD; ribosomal protein SA) 0.062
336143 CH22_FGENES.705_5 0.063
333493 CH22_FGENES.168_2 0.063
332533 M99487 Hs.1915 folate hydrolase (prostate-specific membrane antigen) 1 0.063
325844 CH.16_hsgi|6552453 0.063
336402 CH22_FGENES.823_17 0.063
335767 CH22_FGENES.607J 0.064
301693 T80334 EST cluster (not in UniGene) with exon hit 0.064
324019 AW177009 EST cluster (not in UniGene) 0.064
305801 AA845997 EST singleton (not in UniGene) with exon hit 0.064
335188 CH22_FGENES.507_3 0.065
337533 CH22_FGENES.828-2 0.065
333311 CH22_FGENES.138_3 0.065
335668 CH22_FGENES.590J9 0.065
306786 AI041589 EST singleton (not in UniGene) with exon hit 0.066
306365 AA962086 EST singleton (not in UniGene) with exon hit 0.066
306249 AA933840 EST singleton (not in UniGene) with exon hit 0.066
335018 CH22_FGENES.474_6 0.066
333594 CH22_FGENES.210_3 0.066
333900 CH22_FGENES.293_7 0.066
325207 CH.10_hsgi|6552430 0.067
329888 CH.15_p2gi|6067149 0.067
326238 CH.17_hsgi|5867260 0.067
333658 CH22_FGENES.241_4 0.067
335809 CH22_FGENES.617_6 0.068
307427 AI243437 EST singleton (not in UniGene) with exon hit 0.068
318428 AI949409 Hs.224583 ESTs 0.069
327005 CH.21_hsgi|5867664 0.069
330463 HG998-HT998 Sulfotransferase, Phenol-Preferring 0.069
333318 CH22_FGENES.138_10 0.07
333313 CH22_FGENES.138_5 0.07
325937 CH.16_hsgi|5867132 0.07
335663 CH22_FGENES590J4 0.07
335349 CH22_FGENES.539_2 0.07
303396 AA224470 Hs.25426 ESTs; Weakly similar to unknown [H.sapiens] 0.07
332603 N66681 H&33470 ESTs 0.07
333310 CH22_FGENES.138_2 0.071
309924 AW340812 EST singleton (not in UniGene) with exon hit 0.071
336340 CH22_FGENES.814J5 0.071
308025 AI453365 Hs.172928 collagen; type I; alpha 1 0.071
306805 AI055966 EST singleton (not in UniGene) with exon hit 0.071
335499 CH22_FGENES.571J 0.071
328669 CH.14_n2gi|6272129 0.071
321666 D28390 EST cluster (not in UniGene) 0.071
338174 CH22_EM:AC005500.GENSCAN.219-2 0.072
336556

305451 AA738105 

336684 

326943 

333947 

333214 

331917 AA446572 

339102 

328122 

332250 N62712 



331756 AA291468 
335193 

317729 AA971718 
304515 AA458708 
313644 AI565766 
326145 
336394 

306516 AA989542 
300629 AA152119 

333160 
337490 

305403 AA723746 
331747 AA281765 
332792 

330513 M81057 
308905 AIB59636 
337419 



333459



335331



327879 

305830 AA857665 
302928 AL137719 
304321 AA136698 
326390 
335230 



304753 AA578840 
301863 AI418863 



335611 

305060 AA635771 
306051 AA905130 
308289 AI571211 
334365 
335496 

332634 S38953 

337824 
335822 
334758 

309641 AW194230 
333064 



331809 AA402482 

326138 

328304 

330570 U60276 
334305 



325839 
333531 

330385 AA449749 

323305 AA811351 
331698 Z39929 



CH22_FGENES.842J 0.072 
Hs.140 Immunoglobulin gamma 3 (Gm marker) 0.072 

CH22.FGENES.46-1 0.072 

CH.21_hsgi|6004446 0.073

CH22_FGENES.303_1 0.074

CH25LFGENES.104 5 0.074
Hs.174007 ESTs; Moderately similar to !!!! ALU SUBFAMILY J WARNING 0.074

CH22_DA59H18.GENSCAN.44-9 0.074

CH.06_hsgi|5868031 0.075
Hs.226223 KIAA0618 gene product 0.075 

CH.07_hsgiJ5868471 0.075 
Hs.98504 ESTs 0.075 

CH22.FGENES507J 0.076 
Hs.128141 ESTs 0.076 
H&251577 hemoglobin; alpha 2 0.076 
Hs.124960 ESTs 0.076 

CH.17_hsgl|5867204 0.076 

CH22.FGENES.823_6 0.077 

EST singleton (not in UniGene) with exon hit 0.077
Hs.155101 ATP synthase; H+ transporting; mitochondrial F1 complex; alpha subunit; isoform 1; cardiac muscle

CH22J=GENES.91_2 

CH22_FGENES.799-5 

EST singleton (not in UniGene) with exon hit 
Hs.193689 ESTs 

CH22_FGENES-3_2 
Hs. 180884 carboxypeptidase B1 (tissue) 
Hs.8102 ribosomal protein S20 

CH22J=GENES.759-4 

CH22_FGENES.157J 

CH22_FGENES.440_3 

CHJLhsgi|5868569 

CH.06_hsgj|5868142 

EST singleton (not in UniGene) with exon hit 

EST cluster (not in UniGene) with exon hit 
Hs.1 13029 ribosomal protein S25 

CH.19Jisgi|5867340 

CH22_FGB^ES.514_2 

CH22_FGENES>»2J 

CH22_FGENES£35_4 
Hs.77961 nrajorhistoconpa&flitycomple^ 

EST cluster (not in UniGene) with exon hit 

CH2^.FGENES.842_6 

CH22_FGENES.563_5 

EST singleton (not in UniGene) with exon hit 

EST singleton (not in UniGene) with exon hit 

EST singleton (not in UniGene) with exon hit 

CH22_FGENES.378_13 

CH22_FGENES£71_4 

Human unidentified gene complementary to P450c21 



CH22_EM:AC005500.GENSCAN.13-18 
CH22_FGENES.619_7 
CH22^FGENES.428_7 
HS253100 EST 

CH2?_FGENES.75 7 
CH22_EM:AC005500.GENSCAN.477-25 
HSJ97312 ESTs 

CH.17Jisgi|58672Q3 
CH.07Jsgi|6004478 
Hs.165439 arsA (bacterial) arsenite transporter; ATP-binding; homolog 1
CH22_FGENES.373_8
CH22_FGENES.632_3
CH.16_hsgi|6552452
CH22_FGENES.175J8
Hs.31386 ESTs; Highly similar to secreted apoptosis related protein 1 [H.sapiens]

Homo sapiens clone 24812 mRNA sequence 
ESTs 



H^25307 
Hs.65843 



0.077
0.077
0.077
0.077
0.077
0.078
0.078
0.078
0.078
0.078
0.078
0.078
0.079
0.079
0.079
0.079
0.079
0.08
0.08
0.08
0.08
0.081
0.081
0.081
0.081
0.082
0.082
0.082
0.082

0.082
0.082
0.082
0.082
0.082
0.083
0.083
0.083
0.083
0.083
0.083
0.083
0.083
0.083
0.084

0.084
0.084
0.084
335888 CH22_FGENESX33J2 0.084
306008 AA894390 EST singleton (not in UniGene) with exon hit 0.084
334249 CH22_FGENES.3S5_15 0.084
318303 AW451197 Hs.113418 ESTs 0.084
330171 CH.O?_p2gi|6648220 0.084
336682 CH22_FGENES.4M 0.085
320506 AI815668 Hs.157476 suc1-associated neurotrophic factor target 2 (FGFR signalling adaptor) 0.085
316974 AI740721 Hs.128292 ESTs 0.085
336492 CH22_FGENES.832_9 0.085
335750 CH22_FGENES.602_4 0.085
335676 CH22_FGENES.594J 0.086
336093 CH22_FGENES.691_2 0.086
310932 AI933861 Hs£22852 ESTs 0.086
335160 CH22_FGENES.502_4 0.086
334306 CH22_FGENES.373_9 0.086
334793 CH22_FGENES.433_5 0.086
333936 CH22_FGENES.301_2 0.087
336413 CH22_FGENESX23_35 0.087
333775 CH22_FGENES.272_6 0.087
335971 CH22_FGENESX52jt 0.087
301737 AI815981 EST cluster (not in UniGene) with exon hit 0.087
339101 CH22_DA59H18.GENSCAN.44-6 0.087
327612 CH.04_hsgi|6525263 0.087
326241 CH.17_hsgi|5867260 0.088
338386 CH22_EM:AC005500.GENSCAN.331^ 0.088
327762 CH.05_hsgi|5867961 0.088
305266 AA679772 EST singleton (not in UniGene) with exon hit 0.088
334359 CH22_FGENES.378_4 0.088
335500 CH22_FGENES.571J0 0.088
329687 CH.14_p2gi|6117856 0.088
333654 CH22_FGENESJ240J 0.088
324430 AA464016 EST cluster (not in UniGene) 0.088
325999 CH.16_hsgi|5867073 0.089
334832 CH22_FGENES.439J 0.089
339115 CH22_DA59H18.GENSCAN.49-3 0.089
300896 AI916902 Hs.213882 ESTs 0.089
328784 CH.07_hsgi|5868309 0.089
335044 CH22_FGENES480J 0.089
329791 CH.14_p2gi|6469354 0.089
333656 CH22_FGENES.240_4 0.089
326180 CH.17_hsgi|5867211 0.089
333391 CH22_FGENES.144_6 0.089
338324 CH22_EM:AC005500.GENSCAN^06>3 0.089
305396 AA721052 EST singleton (not in UniGene) with exon hit 0.089
337483 CH22_FGENES.795-7 0.09
326424 CH.19_hsgi|5867369 0.09
306454 AA977992 EST singleton (not in UniGene) with exon hit 0.09
338893 CH22_DJ32I10.GENSCAN.7-6 0.09
327470 CH.02_hsgi|5867772 0.09
333165 CH22_FGENESX1_7 0.09
307155 AI186738 Hs.182426 ribosomal protein S2 0.09
330717 AA233926 Hs23635 ESTs 0.09
335334 CH22_FGENES.535J0 0.09
335907 CH22_FGENES.636_2 0.09
333885 CH22_FGENES292J 0.09
331034 N51868 Hs.31965 ESTs; Moderately similar to 40S RIBOSOMAL PROTEIN S20 [H.sapiens] 0.09
304660 AA534416 Hs.162185 ESTs 0.09
328217 CH.06_hsgi|5858096 0.091
336068 CH22_FGENES.684_13 0.091
302833 AA295381 Hs.44423 ESTs 0.091
328668 CH.07_hsgi|5868254 0.091
335309 CH22_FGENES532J 0.091
338481 CH22_EM:AC005500.GENSCAN.377,5 0.091
306286 AA936892 EST singleton (not in UniGene) with exon hit 0.091
305070 AA639783 EST singleton (not in UniGene) with exon hit 0.091
304870 AA594811 Hs.119122 ribosomal protein L13a 0.091
303856 AA968589 H&944 glucose phosphate isomerase 0.091
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329893 CH.15_p2gi|6525313 0.098 

326533 CH.19_hsgi|5867441 0.098 

334905 CH22_FGENES.452_20 0.098 
306347 AA961 144 EST singleton (not In UnlGene) with exon hit 0.098 

5 336676 CH2JLFGENES.434 0,098 

339166 CH22_DA59H18.GENSCAN.69-7 0.098 

335774 CH22.FGENES.607J0 0.098 

339216 CH2^FF113D11.QENSCAN.6-11 0.098 

335311 CH22_FGENES.532_4 0.098 

10 329632 CH.11_p2gi|6729060 0.098 

328595 CH.07_hsgi|5868224 0.098 

326928 CH.21 hs gij6456782 0.098 

315234 AI079680 Hs.120770 ESTs 0.098 

306082 AA908508 EST singleton (not In UniGene) with exon hit 0.098 

15 305710 AA826544 EST singleton (not in UniGene) with exon hit 0.098 

318540 T30280 EST duster (not in UnlGene) 0.099 

337553 CH22_C4G1.GENSCAN.2-1 0,099 

320951 AA344069 H&2Q2699 neurexophitin 4 0.099 

303845 T08033 EST duster (not in UniGene) with exon hit 0.099 

20 338981 CH22_DA59Hl8.GENSCAN.2-5 0.099 

321313 R87365 Hs26058 ESTs; Weakly similar to p532 [H^apiens] 0.099 

328348 CK07_hsgi|5868383 0.099 

332203 H49388 Hs.102082 EST 0.099 * 

301780 R07064 EST duster (not in UnlGene) with exon hit 0.099 

25 332095 AA608838 Hs.162681 EST 0.099 

3332Z7 CH22JGENES.107_5 0X199 

316442 AA760894 Hs.1 53023 ESTs 0.099 

326001 CH.16_hs gi]5867073 0.099 

334363 CH22.FGENES.378J1 0.099 

30 338895 CH22_DJ32M0.GENSCAN.9-2 0.099 

327460 CH.02_hsgij6004455 0.099 

332705 T59161 Hs.76293 thymosin; beta 10 0.1 

307806 A1351739 EST singleton (not In UniGene) with exon hit 0.1 

322800 F25037 Hs.225175 ESTs 0.1 

35 304918 AA602697 EST singleton (not in UniGene) with exon hit 0.1 

334327 CH22_FGENES.375_4 0.1 

318359 AI097439 Hs. 135548 ESTs 0.1 

326644 Ca20_hsg1|5867559 0.1 

334454 CH22_FGENES.388_3 0.1 

40 327959 CHiB_hs gi[5868210 ai 

323783 AA330586 Hs.131819 ESTs 0.1 

309198 AI955915 H&248038 major histocompatibility complex; class I; C 0.1 

339265 CH22 BA354I12.GEN SCAN. 10-3 0.1 

320576 AL049977 Hs. 162209 Homo sapiens mRNA; cDNA DKFZp564C122 

45 (from done DKFZp564C122) ai 

338132 CH22_BMC005500.GENSCAN.20O-2 0.1 

333163 CH22_FGENES.91_5 0.101 

337584 CH22_C20H12.GENSCAN.5-1 0.101

307588 AI285535 EST singleton (not in UniGene) with exon hit 0.101

336969 CH22_FGENES.378-2 0.101

327535 CH.02_hsgi|6525279 0.101

328732 CH.07_hsgi|5868289 0.101

336686 CH22_FGENES.46-3 0.101

335777 CH22_FGENES.607_13 0.101

332944 CH22_FGENES.47_3 0.101

333174 CH22_FGENES.95_1 0.101

336380 CH22_FGENES.821_8 0.101

330571 U60800 Hs.79089 sema domain; immunoglobulin domain (Ig); cytoplasmic domain; (semaphorin) 4D 0.101

60 331789 AA398721 Hs.186749 ESTs 0.101 

338915 CH22_DJ32I10.GENSCAN.1M - 0.101 

334844 CH22_FGENES.439_24 0.101 

336642 CH22_FGENES.234 0.101

334906 CH22_FGENES.452_21 0.101

333188 CH22_FGENES.98_8 0.101

300088 AW299993 EST cluster (not in UniGene) with exon hit 0.101

329373 CHX_hsgi|6682537 0.102

331120 R46576 Hs.23239 ESTs 0.102

335856 CH22_FGENES.628J 0.102

331888 AA431337 HsM>17 ESTs 0.102

333154 CH22_FGENES.89_4 0.102

335989 CH22_FGENES.655_2 0.102

304385 AA235602 EST singleton (not in UniGene) with exon hit 0.102

338016 CH22_EM:AC005500.GENSCAN.133-1 0.102

335190 CH22_FGENES.507_15 0.102

318595 T39486 Hs£137 ESTs 0.102 

333897 CH22_FGENES.250J1 0.102 

306526 AA989713 EST singleton (not in UniGene) with exon hit 0.103 

328734 CH.07_hsgi|5868289 0.103

307294 AI205612 Hs.73742 ribosomal protein; large; P0 0.103

327424 CH.02_hsgi|5867751 0.103

335872 CH22_FGENES.630_3 0.103

333572 CH22_FGENES.189_1 0.103 

15 334774 CH22_FGENES.430_6 0.103 

338660 CH22_EM:AC005500.GENSCAN.462-1 0.103

326713 CH20_hsQ$m7S95 0.103

333994 CH22_FGENES.310_18 0.103

335800 CH22_FGENES.613_4 0.103

318113 AI187943 Hs.132322 ESTs 0.103

337278 CH22_FGENES.665-1 0.103

336386 CH22_FGENES.822_6 0.103

334790 CH22_FGENES.432_15 0.103

303778 AW505368 EST cluster (not in UniGene) with exon hit 0.104 

25 336524 CH22_FGENES.839_5 0.104 

328936 CH.08_hsgi|5868500 0.104

335102 CH22_FGENES.494_7 0.104

300935 AA513644 Hs.222815 ESTs; Weakly similar to Wiskott-Aldrich Syndrome protein [H.sapiens] 0.104

307581 AI284415 EST singleton (not in UniGene) with exon hit 0.104

317301 AW291683 Hs.226056 ESTs 0.104

335330 CH22_FGENES.535_3 0.104

337968 CH22_EM:AC005500.GENSCAN.103-2 0.104

335627 CH22_FGENES.584_7 0.104

336274 CH22_FGENES.762_12 0.104

334730 CH22_FGENES.424_15 0.105

334409 CH22_FGENES.383_6 0.105 

327237 CH.01_hsgi|5867544 0.105

333321 CH22_FGENES.138_13 0.105

303181 AA452366 EST cluster (not in UniGene) with exon hit 0.105

333738 CH22_FGENES.261_2 0.105

338255 CH22_EM:AC005500.GENSCAN.276-3 0.105

334282 CH22_FGENES.369_12 0.105

330190 CH.05_p2 gi|6165182 0.105

310748 AW014249 Hs.158698 ESTs 0.105

338150 CH22_EM:AC005500.GENSCAN.207-2 0.105

336719 CH22_FGENES.82-6 0.105

330228 CH.05_p2gi|6013527 0.105

327801 CH.05_hs gi|5867924 0.105

330525 S75168 Hs.274 megakaryocyte-associated tyrosine kinase 0.105

334972 CH22.FGBiES.468_? 0.105 

335111 CH22_FGENES.494_19 0.106

334483 CH22_FGENES.395_5 0.106

328829 CH.07_hsgi|5868337 0.106

302753 M74299 EST cluster (not in UniGene) with exon hit 0.106

334512 CH22_FGENES.398_10 0.106

330024 CH.16_p2gi|6671908 0.106

321030 AJ769930 Hs.233617 Homo sapiens (clone B3B3E13) Huntington's disease candidate region 0.107

338410 CH22_EMAC005500.GENSCAN^41-6 0.107

334353 CH22_FGENES.376_5 0.107

338276 CH22 EM:AC005500.GENSCAN.288* 0.107 

329053 CHX_hs gi|5868574 0.107 

336560 CH22_FGENES.842_5 0.107 

65 332158 AA621363 Hs.112980 EST 0.107 

336447 CH22_FGENES.829_4 0.107 

333703 CH22_FGENES.250_17 0.107

326207 CH.17_hsgi|5867222 0.107

333232 CH22_FGENES.108_1 0.107

334802 CH22_FGENES.435_1 0.107

303784 AA704983 EST cluster (not in UniGene) with exon hit 0.107

338847 CH22_DJ246D7.GENSCAN.10-2 0.107

339407 CH22_DJ579N16.GENSCAN.1-9 0.108

337635 CH22_C20H12.GENSCAN.32-8 0.108

334650 CH22_FGENES.417_17 0.108

308511 AI687580 EST singleton (not in UniGene) with exon hit 0.108

333392 CH22.FGENES.144J 0.108 

325840 CH.16_hsgi|6552452 0.108

315044 AW205664 Hs.129566 ESTs 0.108

333298 CH22_FGENES.133_4 0.108

335157 CH22_FGENES.501J 0.108

333305 CH22_FGENES.137_2 0.108

326379 CH.19_hsgi|5867327 0.108

335050 CH22_FGENES.482_1 0.108

305185 AA663985 Hs.248038 major histocompatibility complex; class I; C 0.108

335658 CH22_FGENES.590_9 0.108

323040 AA336609 Hs.10862 ESTs 0.108 

337326 CH22_FGENES.63^6 0.108 

339262 CH22_BA354I12.GENSCAN.9-6 0.108

321202 H54052 Hs.163639 ESTs; Weakly similar to INTERCELLULAR ADHESION MOLECULE-1 PRECURSOR [H.sapiens] 0.109

331792 AA398968 Hs£7548 EST 0.109

333806 CH22_FGENES.278_2 0.109

321325 AB033100 EST cluster (not in UniGene) 0.109

331373 AA435513 Hs.178170 ESTs; Weakly similar to DUAL SPECIFICITY PROTEIN PHOSPHATASE 3 0.109

328775 CH.07_hsgi|5868309 0.109

335105 CH22_FGENES.494_10 0.109

300975 AI283548 Hs.149668 ESTs 0.109

324893 T31940 EST cluster (not in UniGene) 0.109

333397 CH22_FGENES.144_15 0.109

336484 CH22_FGENES.831_3 0.109

335507 CH22_FGENES.571_22 0.109 

336373 CH22_FGENES.820_3 0.109

338188 CH22_FGENES.717_12 0.109

313455 AW081702 Hs.137329 ESTs 0.109

335185 CH22_FGENES.506_4 0.109

306814 AI065577 EST singleton (not in UniGene) with exon hit 0.109

311130 AK32322 Hs.195306 ESTs 0.109

310882 AW060339 Hs.211911 ESTs 0.109

323383 AI346359 Hs.135209 ESTs 0.11

300212 AW135925 Hs.184552 biphenyl hydrolase-like (serine hydrolase; breast epithelial mucin-assoc 0.11

325675 CH.14_hsgi|5867014 0.11

330095 CH.19_p2gi|6015278 0.11

331942 AA453261 Hs.39309 ESTs 0.11

334723 CH22_FGENES.421_34 0.11 

333614 CH22_FGENES.217_0 0.11

337316 CH22_FGENES.692-1 0.11

305057 AA635626 Hs.62954 ferritin; heavy polypeptide 1 0.11

338704 CH22_EM:AC005500.GENSCAN.480-3 0.11

335385 CH22_FGENES.543_27 0.11

338012 CH22_EM:AC005500.GENSCAN.128-10 0.11

329449 CH.Y_hsgi|5868886 0.11

338980 CH22_DA59H18.GENSCAN_M 0.11 

336553 CH22.FGENES.841J0 0.111 

330021 CH.16_p2gi|6671889 0.111 

327579 CHJHLhs gi|5867824 0.111

333099 CH22_FGENES.79_4 0.111

337076 CH22_FGENES.453-4 0.111

331388 AA456852 Hs.43543 suppressor of white apricot homolog 2 0.111

306674 AI005542 Hs.180414 heat shock 70kD protein 10 (HSC71) 0.111

305949 AA884409 EST singleton (not in UniGene) with exon hit 0.111

330748 AA419217 Hs.15911 DKFZP586E1422 protein 0.111

333780 CH22_FGENES.273_2 0.111

323676 AI702835 EST cluster (not in UniGene) 0.111

308952 AI868157 Hs£24226 EST 0.111

309338 AW026946 Hs.181165 eukaryotic translation elongation factor 1 alpha 1 0.111

329317 CHX_hsgi|6381976 0.112

333518 CH22_FGENES.173J 0.112

306982 AI127883 EST singleton (not in UniGene) with exon hit 0.112

338225 CH22 FGENES.728J 0.112 

333688 CH22_FGENES.250_12 0.112

302173 AI417947 Hs.14068 ESTs 0.112

335510 CH22_FGENES.571_15 0.112

328042 CH.06_hs gi|5902482 0.112

336512 CH22.FGENES.834J 0.112

328541 CH.07_hs gi|5868486 0.112

311265 AW205118 Hs.199214 ESTs 0.112

323218 AF131846 Hs.13396 Homo sapiens clone 25028 mRNA sequence 0.112

302002 AF013956 Hs.123085 chromobox homolog 4 (Drosophila Pc class) 0.112

315088 AA557351 Hs.152448 ESTs; Moderately similar to MULTIFUNCTIONAL PROTEIN ADE2 0.112 

15 312581 AI937242 Hs.176590 ESTs 0.112 

322246 AW384710 Hs.125258 ESTs 0.112 

333659 CH22_FGENES.241_5 0.113

327510 CH.02_hsgi|6117815 0.113

336520 CH22_FGENES.839_1 0.113

338682 CH22_EM:AC005500.GENSCAN.472-1 0.113

334508 CH22_FGENES.398_6 0.113

322533 T59538 EST cluster (not in UniGene) 0.113

306873 AI086929 EST singleton (not in UniGene) with exon hit 0.113

336040 CH22_FGENES.679_2 0.113

303898 T23215 EST cluster (not in UniGene) with exon hit 0.113

312011 AW294868 Hs.187226 ESTs 0.113

335186 CH22_FGENES.506_5 0.113

333607 CH22_FGENES.216_2 0.113

305549 AA773530 EST singleton (not in UniGene) with exon hit 0.113

333686 CH22_FGENES.249_4 0.113

334352 CH22_FGENES.376_3 0.113

338195 CH22_EMAC005500.GENSCAN^33.18 0.114 

333588 CH22_FGENES.206_2 0.114 

339233 CH22_BA354I12.GENSCAN.2-3 0.114

337455 CH22.FGENES.77M 0.114

309101 AI925108 EST singleton (not in UniGene) with exon hit 0.114

328522 CH.07_hsgi|5868477 0.114

323999 AI537333 Hs^52782 ESTs 0.114

333517 CH22_FGENES.173_2 0.114

329935 CH.16_p2gi|6165200 0.114

326226 CH.17_hsgi|5867230 0.114

335890 CH22_FGENES.633_4 0.114

338715 CH22_FGENES.77-1 0.114

327640 CH.04_hsgi|5867890 0.114

338842 CH22_DJ246D7.GENSCAN.7-1 0.114

306534 AA991487 EST singleton (not in UniGene) with exon hit 0.114

336597 CH22.FGENES.266J 0.114

321010 Y17456 Hs.227150 Homo sapiens LSFR2 gene; last exon 0.114

302294 AA159213 Hs.5337 isocitrate dehydrogenase 2 (NADP+); mitochondrial 0.114

324895 N44238 Hs.77515 inositol 1;4;5-triphosphate receptor; type 3 0.114

327358 CH.01_hsgi|6552411 0.114

308792 AI815153 Hs.195188 glyceraldehyde-3-phosphate dehydrogenase 0.115

325886 CH.16_hsgi|5867087 0.115

336850 CH22_FGENES.272-11 0.115

305858 AA863103 EST singleton (not in UniGene) with exon hit 0.115

302569 AC004472 multiple UniGene matches 0.115

338158 CH22_FGENES.707J> 0.115

327866 CH.06_hsgi|5868131 0.115

339157 CH22_DA59H18.GENSCAN.67-3 0.115 

339258 CH22_BA354I12.GENSCAN.8-3 0.115

336129 CH22_FGENES.701_17 0.115

333684 CH22.FGENES.249J 0.115

309618 AW190162 Hs.184776 ribosomal protein L23a 0.115

312926 AA954097 Hs.127523 ESTs 0.115

302640 AB035698 EST cluster (not in UniGene) with exon hit 0.115

328968 CH.08_hsgi|6456775 0.115

327902 CH.06_hsgi|5868158 0.115

321927 AJ223366 EST cluster (not in UniGene) 0.115

335962 CH22_FGENES.651_4 0.115

334927 CH22.FGENES.460J 0.115

330535 U11872 Human interleukin-8 receptor type B (IL8RB) mRNA, splice variant IL8RB1 0.656

328591 CH.07_hs gi|5868227 0.115

5 334902 CH22_FGENES.452_16 0.115 

328525 CH.07_hsgi|5868482 0.115 

325870 CH.16_hsgi|6682492 0.116 

337522 CH22 FGENES.819-1 0.116 

305079 AA641329 EST singleton (not in UniGene) with exon hit 0.116

327343 CH.01_hsgi|6017017 0.116

333918 CH22_FGENES.296_7 0.116 

333600 CH22 FGENES.213J 0.116 

335846 CH22.FGENES.623J 0.116 

333510 CH22_FGENES.171J 0.116 

327629 CH.04_hsgi|5867872 0.116

333470 CH22.FGmES.161J 0.116

326855 CH.20_hsgi|6552460 0.116

327008 CH.21_hsgi|5867664 0.117

337480 CH22.FGENES.795-3 0.117

336425 CH22_FGENES.824_10 0.117

321964 AL079687 Hs.171065 ESTs 0.117

335651 CH22_FGENES.590_2 0.117

308164 AI521574 Hs.181165 eukaryotic translation elongation factor 1 alpha 1 0.117

337927 CH22_EM:AC005500.GENSCAN.80-3 0.117

300341 H45095 Hs.153524 ESTs 0.117

300154 AE45127 Hs.179331 ESTs 0.117

306295 AA937331 EST singleton (not in UniGene) with exon hit 0.117

329670 CH.14_p2gi|6272129 0.117

335612 CH22_FGENES.583_6 0.117

307845 AI363450 EST singleton (not in UniGene) with exon hit 0.117

330401 D28383 Human mRNA for ATP synthase B chain, 5'UTR (sequence from the 5'cap to the start codon) 0.117

327127 CH.21_hsgi|6682520 0.117

333843 CH22.FGENES.280J 0.117

331083 R17762 Hs.22292 ESTs 0.117

329140 CHJLhsgp17060 0.117

339338 CH22_BA354I12.GENSCAN.27-3 0.117

331974 AA464518 Hs.99616 ESTs 0.117 

338631 CH22_EM:AC005500.GENSCAN.454-2 0.117

330299 CH.06_p2gi|2905881 0.117

330351 CH.09_p2gi|3056622 0.117

305377 AA715714 Hs.181357 laminin receptor 1 (67kD; ribosomal protein SA) 0.117

333106 CH22_FGENES.79_12 0.117

338514 CH22_EV,-AC005500.GENSCAN^24 0.117

327335 CH.01_hsgi|5902477 0.117

301970 AB028962 Hs.120245 KIAA1039 protein 0.118

326339 CH.17_hsgi|6056311 0.118

330612 X15673 Hs.93174 Human endogenous retrovirus pHE.1 (ERV9) 0.118

334178 CH22_FGENES.350_6 0.118

328008 CH.06_hsgi|5902482 0.118

329976 CH.16_p2gi|4878063 0.118

320952 AA897432 Hs.130411 ESTs 0.118

305621 AA789095 EST singleton (not in UniGene) with exon hit 0.118

337850 CH22_EM:AC005500.GENSCAN.34-3 0.118

333626 CH22.FGENES.224J 0.118

337672 CH22_EM:AC000097.GENSCAN.67-1 0.118

328603 CH.07_hsgi|6004475 0.118

325922 CH.16_hsgi|5867122 0.118

334489 CH22_FGENES.397_1 0.118

320638 R54766 Hs.101120 ESTs 0.118

321932 AA569229 EST cluster (not in UniGene) 0.118

336958 CH22_FGENES.367-1 0.118

332082 AA600176 Hs.112345 ESTs 0.118

306004 AA889992 EST singleton (not in UniGene) with exon hit 0.118 

65 336803 CH22.FGENES.194-1 0.118 

309107 AI925823 EST singleton (not in UniGene) with exon hit 0.118 

336859 CH22^FGENES^93^ 0.118 

337935 CH22_EMAC005500.GENSCAN^5-6 0.118 

326492 CH.19_hsgi|5867422 0.118 

327289 CH.01_hsgi|5867481 0.119

325818 CH.14_hsgi|6682490 0.119

310787 AW262580 Hs.159040 ESTs 0.119

330028 CH.16_p2gi|6671808 0.119

325317 CH.11_hsgi|5866878 0.119

335279 CH22.FGENES.523J 0.119

331720 AA192173 Hs321530 ESTs 0.119

329186 CHX_hsgi|5868711 0.119

316012 AA764950 Hs.119898 ESTs 0.119

338316 CH22_EM:AC005500.GENSCAN.304-2 0.119

326033 CH.17_hs gi|5867178 0.119

334745 CH22.FGENES.426J 0.119

333051 CH22_FGENES.73_5 0.119

301763 R01279 EST cluster (not in UniGene) with exon hit 0.12

304502 AA454809 Hs.172928 collagen; type I; alpha 1 0.12

335680 CH22_FGENES.594_5 0.12

304678 AA548556 EST singleton (not in UniGene) with exon hit 0.12

335441 CH22_FGENES.560_4 0.12

336187 CH22_FGENES.717_11 0.12

309422 AW087175 EST singleton (not in UniGene) with exon hit 0.12

336047 CH22_FGENES.679_9 0.12

309651 AW195850 EST singleton (not in UniGene) with exon hit 0.12 

308547 AI695385 Hs£01903 EST 0.12 

304443 AA399444 EST singleton (not in UniGene) with exon hit 0.12 

336245 CH22_FGENES.746_3 0.12

302703 K72333 EST cluster (not in UniGene) with exon hit 0.12

335690 CH22_FGENES.596_5 0.12

328941 CH.08_hsgi|6456765 0.12

333873 CH22.FGB1ES.291J 0.12

317246 AW105092 Hs.155690 ESTs 0.12

339288 CH22_BA354I12.GENSCAN.16-6 0.12

337996 CH22_EM:AC005500.GENSCAN.116-3 0.12

333304 CH22_FGENES.137_1 0.121

308332 AI591235 EST singleton (not in UniGene) with exon hit 0.121

329319 CHX_hsgi|6381976 0.121

302086 X57138 multiple UniGene matches 0.121

333290 CH22_FGENES.129_2 0.121

323825 AI793080 Hs.123525 ESTs; Weakly similar to NEUTROPHIL GELATINASE-ASSOCIATED LIPOCALIN PRECURSOR [R.norvegicus] 0.121

330575 U64105 Hs.252280 Rho guanine nucleotide exchange factor (GEF) 1 0.121

305274 AA679990 Hs.181165 eukaryotic translation elongation factor 1 alpha 1 0.121

333647 CH22_.FGENES.235J 0.121

302251 AA333340 EST cluster (not in UniGene) with exon hit 0.121

329777 CH.14_p2gi|6002090 0.121

333155 CH22_FGENES.89_5 0.121

326122 CH.17_hsgi|5867194 0.121

335310 CH22_FGENES.532_3 0.121

335453 CH22_FGENES.562_13 0.122

305103 AA643329 Hs.111334 ferritin; light polypeptide 0.122

50 337284 CH22_FGENES.667^ 0.122 

337418 CH22_FGENES.758-4 0.122 

313073 AI963740 Hs.46826 ESTs 0.122 

303759 AW504164 EST cluster (not in UniGene) with exon hit 0.122

300017 M33197 AFFX control: GAPDH 0.122

316725 AW135084 Hs.127264 ESTs 0.122

330738 AA293153 Hs.120980 nuclear receptor co-repressor 2 0.122

336466 CH22_FGENES.829_25 0.122

335956 CH22_FGENES.647_3 0.122

315308 AA780564 Hs.189053 ESTs 0.122

338925 CH22_DJ32I10.GENSCAN.14-3 0.122

334969 CH22_FGENES.466_2 0.122

322050 AI137589 EST cluster (not in UniGene) 0.122

339084 CH22_DA59H18.GENSCAN.38-2 0.122

338323 CH22_EM:AC005500.GENSCAN.306-2 0.122

337003 CH22_FGENES.419-7 0.122

325470 CH.12_hsgi|6017034 0.123

336503 CH22_FGENES.833_10 0.123

330786 D60374 Hs.258712 EST 0.123

329446 CH.Y_hsgi|5868886 0.123

303326 AA229433 Hs.222634 ESTs; Moderately similar to ubiquitin-like protein/ribosomal protein S30 0.123

309067 AI916313 Hs.212788 EST 0.123

317464 AA968472 Hs.130463 ESTs 0.123

328755 CH.07_hs gi|5868301 0.123

326036 CH.17_hsgi|5867178 0.123

327208 CH.01_hsgi|5867447 0.123

326124 CH.17_hsgi|5916395 0.123

327509 CH.02_hsgi|6117815 0.123

338398 CH22„EM^C005500.QENSCAN.33^5 0.123

304652 AA527782 Hs.84298 CD74 antigen (invariant polypeptide of major histocompatibility complex; class II antigen-associated) 0.123

335797 CH22_FGENES.612_6 0.124

336714 CH22_FGENES.76-29 0.124

327204 CH.01_hsgi|5867447 0.124

331881 AA430672 Hs.123778 ESTs 0.124

306971 AI126509 EST singleton (not in UniGene) with exon hit 0.124

336174 CH22.FGENES.710J 0.124

336126 CH22_FGENES.701_13 0.124

329129 CHX_hsgi|6588026 0.124

303049 AW407562 EST cluster (not in UniGene) with exon hit 0.124

335778 CH22_FGENES.607_14 0.124

336601 CH2£_FGENES.369j2 0.124

334340 CH22_FGENES.375_17 0.124

337436 CH22_FGENES.767-1 0.124 

306013 AA896990 EST singleton (not in UniGene) with exon hit 0.124 

339213 CH22_FF113D11.GENSCAN.6-8 0.124 

335355 CH22_FGENES.541_2 0.124 

30 336552 CH22_FGENES.841J 0.124 

336384 CH22_FGENES.822_4 0.124 
310485 AI286202 Hs.149800 ESTs 0.125

335840 CH22.FGENES.622J 0.125

336444 CH22_FGENES.827_10 0.125

315703 N36070 EST cluster (not in UniGene) 0.125

327763 CH.05_hsgi|5867961 0.125

336383 CH22_FGENES.822_3 0.125

333496 CH22_.FGBJES.168J 0.125 

328662 CH.07_hsgi|6004473 0.125 

40 338986 CH22_DA59H18.GENSCAN£-1 0.125 

328311 CH.07_hsgi|5868371 0.125

337241 CH22_FGENES£44-2 0.125

336933 CH22_FGENES.350-7 0.125

313483 AW294432 Hs.144252 ESTs 0.125

326116 CH.17_hsgi|5867193 0.125

330450 HG363-HT363 Epidermal Growth Factor Receptor-Related Protein 0.125

307491 AI268539 EST singleton (not in UniGene) with exon hit 0.125

331852 AA416988 Hs£8314 Homo sapiens mRNA; cDNA DKFZp586L0120 (from clone DKFZp586L0120) 0.125

330462 HG944-HT944 Dopamine Receptor D4 0.125

304410 AA284508 EST singleton (not in UniGene) with exon hit 0.125

336385 CH22_FGENES.822_5 0.125

336793 CH22_FGENES.176-3 0.125

326243 CH.17_hsgi|5867281 0.125

327266 CK01_hsgI|58674e2 0.125

320753 AF070579 Hs.181544 Homo sapiens clone 24487 mRNA sequence 0.125

336960 CH22_FGENES.369-5 0.125

329667 CH.14_p2 gi|6272129 0.125

328168 CH.06_hsgi|5868071 0.125

336534 CH22_FGENES.839_16 0.125

339289 CH22_BA354I12.GENSCAN.16-9 0.126

309230 AI970747 EST singleton (not in UniGene) with exon hit 0.126

339190 CH22_FF113D11.GENSCAN.1-2 0.126

337086 CH22_FGENES.458-14 0.126

319233 R21054 Hs.211522 ESTs 0.126

339396 CH22_BA232E17.GENSCAN.6-8 0.126

331930 AA449077 Hs.179765 Homo sapiens mRNA; cDNA DKFZp586H1921 (from clone DKFZp586H192 0.126

308099 AI475914 EST singleton (not in UniGene) with exon hit 0.126

338477 CH22_EM:AC005500.GENSCAN373^ 0.126

334286 CH22_FGENES.369_16 0.126

317245 AI025039 Hs.131732 ESTs 0.126 

335249 CH22_FGENES.516_10 0.126 

5 333327 CH22.FGENES.138J0 0.126 

304240 AA009802 EST singleton (not in UniGene) with exon hit 0.126

335464 CH22_FGENES.562_26 0.126

335236 CH22_FGENES.515J 0.126

334154 CH22_FGENES.340_4 0.126

309257 AI984183 EST singleton (not in UniGene) with exon hit 0.126

310015 AI220122 Hs.201981 ESTs; Weakly similar to breast carcinoma-associated antigen [H.sapiens] 0.126

328280 CH.07_hsgi|5868352 0.126

305744 AA831819 EST singleton (not in UniGene) with exon hit 0.126

327430 CH.02_hsgi|5867754 0.126

328323 CH.07_hsgi|5868373 0.126

333274 CH22_FGENES.123_2 0.126 

337193 CH22_FGENES.575-3 0.127

334820 CH22_FGENES.437_2 0.127

328706 CH.07_hsgi|5868270 0.127

331228 W67267 Hs.174911 ESTs 0.127

307205 AI192479 EST singleton (not in UniGene) with exon hit 0.127

337123 CH22_FGENES.519-3 0.127

326201 CH.17_hsgi|5867216 0.127

335276 CH22_FGENES.523_2 0.127

331202 T81115 Hs.191136 ESTs 0.127

330532 U03187 Hs.121544 interleukin 12 receptor; beta 1 0.127

321235 N49521 EST cluster (not in UniGene) 0.127

301743 F12605 Hs£04529 ESTs; Weakly similar to reverse transcriptase [H.sapiens] 0.127

328175 CH.06_hsgi|5868073 0.127

306407 AA971985 EST singleton (not in UniGene) with exon hit 0.127

327145 CH.01_hsgi|5867548 0.127

327649 CH.04_hsgi|5867899 0.127

335142 CH22_FGENES.498_12 0.127

333909 CH22_FGENES.295J 0.127

330608 X04325 Hs.2679 gap junction protein; beta 1; 32kD (connexin 32; Charcot-Marie-Tooth neuropathy; X-linked) 0.127

330158 CH.21_p2gi|6580367 0.127

320153 AF064594 Hs.120360 phospholipase A2; group VI 0.127

314407 AA098835 Hs.224432 ESTs 0.127

333383 CH22_FGENES.143_22 0.127

£0663 AI734242 Hs^44473 ESTs 0.128 

326233 CH.17_hs gi|5867232 0.128

326598 CH.20_hsgi|5867634 0.128

335174 CH22_FGENES.504_4 0.128

319843 H29920 Hs.99486 ESTs; Weakly similar to aralar1 [H.sapiens] 0.128

335458 CH22_FGENES.562_18 0.128

332997 CH22_FGENES^8_4 0.128

334188 CH22_FGENES.352_3 0.128

329759 CH.14_p2gi|6048280 0.128

330348 CH.09_p2 gi|4544475 0.128

326958 CH.21_hsgi|846983B 0.128

305263 AA679467 EST singleton (not in UniGene) with exon hit 0.128

337693 CH22_EM:AC000097.GENSCAN.78-14 0.128

326812 CH.20_hsgi|6682504 0.128

333237 CH22_FGENES.108_7 0.128

333699 CH22_FGENES.250_13 0.128

311496 AI768677 Hs.209888 ESTs; Weakly similar to phosphatidylserine synthase-2 [M.musculus] 0.128

336499 CH22_FGENES.833_4 0.128

320087 AF032387 Hs.113265 small nuclear RNA activating complex; polypeptide 4; 190kD 0.128

309989 AI184186 Hs.197813 ESTs 0.128 

301490 AW298468 Hs.250461 ESTs 0.128 

337011 CH22_FGENES.427-6 0.128 

65 315052 AA876910 Hs.134427 ESTs 0.128 

301611 W22172 Hs.59038 ESTs 0.128 

336497 CH22_FGENES.833_2 0.129 

302068 Y16280 Hs.132049 endothelin type b receptor-like protein 2 0.129 

334502 CH22_FGENES.397_18 0.129

304332 AA158884 EST singleton (not in UniGene) with exon hit 0.129

304522 AA465405 EST singleton (not in UniGene) with exon hit 0.129

312407 R46180 Hs.153485 ESTs 0.129

310098 AI685841 Hs.161354 ESTs 0.129

301119 AF142579 EST cluster (not in UniGene) with exon hit 0.129

309258 AI985821 Hs.62954 ferritin; heavy polypeptide 1 0.129

330989 H42142 HsJ>26396 DEAD/H (Asp-Glu-Ala-Asp/His) box polypeptide 19 (Dbp5; yeast; homolog) 0.129

336949 CH22_FGENES.361-4 0.129

330115 CH.19_p2gi|6015202 0.129

339212 CH22_FF113D11.GENSCAN.6-7 0.129

326951 CH21_hsgi|6004446 0.129

305165 AA662939 EST singleton (not in UniGene) with exon hit 0.129

308238 AI559492 EST singleton (not in UniGene) with exon hit 0.129

337140 CH22_FGENES.537-5 0.13

321758 U29112 EST cluster (not in UniGene) 0.13

304619 AA515554 Hs.119598 ribosomal protein L3 0.13

312469 AA745289 Hs.173088 ESTs 0.13

339017 CH22_DA59H18.GENSCAN.2<tt 0.13

330116 CH.19_p2gi|6015202 0.13

333312 CH22_FGENES.138_4 0.13

338004 CH22_EM:AC005500.GENSCAN.121-1 0.13

314141 AA232134 Hs.190028 ESTs 0.13

300509 AI239845 Hs.128494 ESTs; Weakly similar to EG:95B7.2 [D.melanogaster] 0.13

CH22_EM:AC005500.GENSCAN39B-11 0.13

CH22_FGENES.652J 0.13

314121 AI732100 Hs.187619 ESTs 0.13

337593 CH22_C20H12.GENSCAN.6-8 0.13

332881 CH22_FGENES.33J 0.13

305836 AA858043 EST singleton (not in UniGene) with exon hit 0.13

339059 CH22_DA59H18.GENSCAN.30-5 0.13

305610 AA782319 EST singleton (not in UniGene) with exon hit 0.13

305852 AA862455 EST singleton (not in UniGene) with exon hit 0.13

327409 CH.02_hsgi|5867750 0.13

312751 AI613089 Hs.164178 ESTs 0.13

308726 AI799268 Hsifl9929 EST 0.13

325961 CH.16_hsgi|5867147 0.13

311159 AW025919 Hs.197636 ESTs 0.13

322715 AA057230 Hs.182135 ESTs 0.13

336441 CH22_FGENES.827_7 0.13

CH22_FGENES.814_12 0.13

306911 AI095365 EST singleton (not in UniGene) with exon hit 0.13

333613 CH22_FGENES.217_8 0.13

338489 CH22_EM:AC005500.GENSCAN.384-17 0.131

3269M CH.21_hsgi|5867684 0.131

337337 CH22_FGENES.717-1 0.131

328752 CH.20_hsgi|5867615 0.131

303977 AW512978 EST singleton (not in UniGene) with exon hit 0.131

301373 AA595235 EST cluster (not in UniGene) with exon hit 0.131

CH22_EM:AC005500.GENSCANJ59-22 0.131

333774 CH22_FGENES.272_5 0.131

CH22_FGENES5L8 0.131

CH22_FGENES.541_12 0.131

CH22_FGENES.635_4 0.131

337825 CH22_EM:AC005500.GENSCAN.13-19 0.131

325257 CH.11_hsgi|5866895 0.131

331188 T50240 Hs.167837 ESTs 0.131

330645 Y08302 Hs.144879 dual specificity phosphatase 9 0.131

331760 AA292721 Hs.154434 ESTs; Weakly similar to unknown [H.sapiens] 0.131

322995 AA513829 Hs.29797 ribosomal protein L10 0.131

335497 CH22_FGENES.571_5 0.131

334824 CH22_FGENES.437_6 0.131

318480 R06933 Hs.184221 ESTs 0.131

334842 CH22_FGENES.439_11 0.131

333335 CH22_FGENES.139_4 0.131

317252 AA905178 Hs.130124 ESTs 0.131

328034 CHX_hsgi|5868561 0.131

305166 AA664230 EST singleton (not in UniGene) with exon hit 0.131

335755 CH22_FGENES.604J 0.131






302143 H15270 Hs.189847 putative neuronal cell adhesion molecule 0.131

334939 CH22_FGENES.465_3 0.131

318994 C15110 Hs.17802 ESTs 0.131

334498 CH22_FGENES.397_14 0.131

333413 CH22_FGENES.146J 0.132

329676 CH.14_p2gi|6272128 0.132

327277 CH.01_hs gi|5867473 0.132

305022 AA627416 EST singleton (not in UniGene) with exon hit 0.132

336805 CH22_FGENES.196-3 0.132

320121 T93657 EST cluster (not in UniGene) 0.132

334761 CH22_FGENES.428_10 0.132

339400 CH22_BA232E17.GENSCAN.7-6 0.132

330301 CH.06_p2 gi|2905862 0.132

316822 AA627691 Hs.129967 ESTs; Weakly similar to neuronal thread protein AD7C-NTP [H.sapiens] 0.132

328020 CH.06_hsgi|5902482 0.132

325327 CH.11_hsgi|5866875 0.132

321163 AA209530 EST cluster (not in UniGene) 0.132

336393 CH22_FGENES.823_5 0.132

325905 CH.16_hsgi|5867104 0.132

305237 AA676286 Hs.2186 eukaryotic translation elongation factor 1 gamma 0.132

339046 CH22_DA59H18.GENSCAN.28-6 0.132

325375 CH.12_hsgi|5866920 0.132

333961 CH22_FGENES.304_7 0.132

25 335450 CH22_FGENES.562_8 0.133 

302286 R58438 EST cluster (not in UniGene) with exon hit 0.133

335116 CH22_FGENES.496_3 0.133

327333 CH.01_hsgi|5902477 0.133

308070 AI470948 EST singleton (not in UniGene) with exon hit 0.133

308311 AI581855 EST singleton (not in UniGene) with exon hit 0.133

320813 AW360847 Hs.208839 ESTs 0.133

323665 AW248307 EST cluster (not in UniGene) 0.133

328318 CH.07_hsgi|5868373 0.133

320603 R51419 EST cluster (not in UniGene) 0.133

332791 CH22_FGENES.3J 0.133

314976 AA524725 Hs.162108 ESTs 0.133

303309 AL134164 Hs.224868 ESTs 0.133

320581 R39753 Hs.170187 ESTs 0.133

333944 CH22_FGBJES3Q2j2 0.133

317992 AI73ffi12 Hs.130901 ESTs 0.133

330935 F02383 Hs.26492 beta-1;3-glucuronyltransferase 3 (glucuronosyltransferase I) 0.133

336659 CH22_FG»JES.3*5 0.133

338887 CH22_DJ32I10.GENSCAN.6-10 0.133

305273 AA679979 Hs.181165 eukaryotic translation elongation factor 1 alpha 1 0.133

333566 CH22_FGENES.183_2 0.134

316952 AW450033 Hs.163312 ESTs 0.134

333818 CH2LFGENES.283J 0.134 

328687 CH.07_hsgi|5868262 0.134

302879 H11802 EST cluster (not in UniGene) with exon hit 0.134

336557 CH22_FGENES.842_2 0.134

335222 CH22_FGENES.513_5 0.134

338094 CH22_EM^C005500.GENSCAN.17W 0.134

337384 CH22_FGENES.745-1 0.134

327360 CH.01_hsgi|6552411 0.134

328132 CH.06_hsgi|5868038 0.134

323604 AI751438 Hs.182827 ESTs; Weakly similar to !!!! ALU SUBFAMILY SQ WARNING ENTRY !!!! 0.134

337591 CH22_C20H12.GENSCAN.6-6 0.134

307018 AI140639 EST singleton (not in UniGene) with exon hit 0.134

326896 CH21_hsgi|5867680 0.134

333479 CH22_FGENES.163_5 0.134

337915 CH22_EM:AC005500.GENSCAN.61-3 0.134

335110 CH22_FGENES.494_18 0.134

333481 CH22_FGENES.163_19 0.134

327512 CH.02_hsgi|6117815 0.134

300096 AW328639 Hs.83575 ESTs; Weakly similar to ZC328.3 [C.elegans] 0.134 

330163 CH02_p2gi|6042042 0.135 

335752 CH22_FGENES.604_1 0.135 

334857 CH22_FGENES.443_1 0.135

301872 H84730 EST cluster (not in UniGene) with exon hit 0.135

337529 CH22_FGENES.823-29 0.135

335734 CH22_FGENES.601_4 0.135

337551 CH22_FGENES.847-8 0.135

309078 AI920965 Hs.77961 major histocompatibility complex; class I; B 0.135

335513 CH22_FGENES.571_28 0.135

339078 CH22_DA59H18.GENSCAN.37-6 0.135

321907 N56660 Hs.148722 ESTs; Weakly similar to large tumor suppressor 1 [H.sapiens] 0.135

337189 CH22_FGENES.571-32 0.135

329635 CH.12_p2gi|5302817 0.135

308601 AI719930 EST singleton (not in UniGene) with exon hit 0.135

305020 AA627248 Hs.2064 vimentin 0.135

333894 CH22.FGENES.293J 0.135

322465 AA137152 Hs.3784 ESTs; Highly similar to phosphoserine aminotransferase [H.sapiens] 0.135

305601 AA780975 EST singleton (not in UniGene) with exon hit 0.135

332186 H10781 Hs.141051 ESTs; Moderately similar to !!!! ALU SUBFAMILY SB WARNING ENTRY 0.135

327822 CH.05_hsgi|5867868 0.135

310087 AI393914 Hs.160624 ESTs; Weakly similar to similar to CR16; SH3 domain binding protein 0.135

328752 CH.07_hsgi|5868298 0.135

337611 CH22_C20H12.GENSCAN.19-4 0.135

334470 CH22_FGENES.394_1 0.136 

25 335115 CH22_FGENES.496_2 0.136 

328730 CH.07_hs gi|5868289 0.136

330350 CH.09_p2gi|3056622 0.136

336971 CH22_FGENES.378-6 0.136

308258 AI565612 EST singleton (not in UniGene) with exon hit 0.136

326745 CH.20_hsgi|5867611 0.136

335440 CH22_FGENES.560_3 0.136

320257 AA330746 EST cluster (not in UniGene) 0.136

328677 CH.07_hsgi|5868256 0.136

329731 CH.14_p2gi|6065783 0.136

315950 AA700553 Hs£06974 ESTs 0.136

330049 CH.17_p2gi|4567182 0.136

337070 CH22_FGENES.448-3 0.136

304095 H11324 Hs.31059 EST 0.136

309304 AW005527 Hs.232820 EST 0.136

333458 CH22_FGENES.157_7 0.136

329899 CH.15_p2gi|6563505 0.136

322202 AI275056 Hs.200133 ESTs 0.136

333991 CH22_FGENES.310_15 0.136

318617 AW247252 Hs.75514 nucleoside phosphorylase 0.136

310623 AI341585 Hs.185588 ESTs 0.136

330489 M23323 Hs.3003 CD3E antigen; epsilon polypeptide (TiT3 complex) 0.136

309646 AW194694 EST singleton (not in UniGene) with exon hit 0.136 

331068 R00071 Hs.191199 ESTs 0.136 

334285 CH22.FGENES.369J5 0.136 

50 332178 F13689 Hs.100725 EST 0.136 

305724 AA827608 EST singleton (not in UniGene) with exon hit 0.136 

303158 AL138110 Hs.8594 Homo sapiens mRNA containing (CAG)4 repeat; clone CZ-CAG-7 0.136

334543 CH22_FGENES.403_8 0.136

335384 CH22_FGENES.543_26 0.136

336527 CH22_FGENES.839_8 0.136

334951 CH22_FGENES.465_20 0.136

325882 CH.16_hsgi|5867087 0.137

305134 AA653159 EST singleton (not in UniGene) with exon hit 0.137

307056 AI148709 EST singleton (not in UniGene) with exon hit 0.137

331943 AA453418 Hs.178272 ESTs 0.137

331116 R44780 Hs.22634 ESTs 0.137

306094 AA908877 EST singleton (not in UniGene) with exon hit 0.137

333561 CH22_FGENES.180_18 0.137

321439 H61962 EST cluster (not in UniGene) 0.137

324594 AA497090 EST cluster (not in UniGene) 0.137

337926 CH22_EM:AC005500.GENSCAN.77-4 0.137

337353 CH22_FGENES.726-1 0.137

331836 AA412295 Hs.104774 EST 0.137

308981 AI873242 EST singleton (not in UniGene) with exon hit 0.137

329424 CH.Y_hsgi|5868879 0.137

325829 CH.15_hsgi|5867052 0.137

331845 AA416863 Hs.98183 ESTs 0.137

333854 CH22_FGENES.290_13 0.137

306591 AI000248 EST singleton (not in UniGene) with exon hit 0.137

326948 CH.08_hsgi|6456765 0.137

338935 CH22_DJ32I10.GENSCAN.18-12 0.137

325960 CH.16_hsgi|5867147 0.137

328377 CH.07_hsgi|5868390 0.138

10 308851 AI829820 EST singleton (not in UniQene) with exon hit 0.138 

314620 AA424352 Hs.210586 ESTs 0.138 

337592 CH22_C20H12.GENSCAN.6-7 0.138 

338684 CH2^EM^C005500.GBiSCAN.472-3 0.138 

331800 AA400498 Hs.97543 ESTs 0.138 

15 304587 AA505535 EST singleton (not In UniGene) with exon hit 0.138 

333981 CH22_FGENES.310_4 0.138 

332452 AA040369 Hs.11 170 SYT interacting protein 0.138 

305752 AA835278 EST singteton (not In UniGene) with exon hit . 0.138 

311947 T65554 H&351591 EST 0.138 

20 333783 CH22_FGENES273_5 0.138 

337406 CH22_FGENES.754-14 0.138 

327976 CH.06_hsgi|5868212 0.138 

325593 CH.13_hsgi|5865992 0.138 ~~ 

339425 CH22_DJ579N16.GENSCAN.144 0.138 

25 304475 AA428879 EST singleton (not in UniGene) with exon hit 0.138 

309488 AW131104 EST singleton (not In UniGene) with exon hit 0.138 

337532 CH22_FGENES.827-6 0.138 

317234 AA904448 Hs.126368 ESTs 0.138 

312261 AA854425 Hs.144455 ESTs 0.138 

328927 CH.08_hsgi|5868500 0.138

336424 CH22_FGENES.824_9 0.138

326667 CH.20_hs gi|6552455 0.138

325988 CH.16_hsgi|5867064 0.138

318446 AW300287 EST cluster (not in UniGene) 0.139

336511 CH22_FGENES.834_6 0.139

335204 CH22_FGENES£08J3 0.139 

303244 AA147472 EST cluster (not in UniGene) with exon hit 0.139

330870 AA115804 Hs.187593 ESTs 0.139

329376 CHJUisgi|5868859 0.139 

304703 AA563898 EST singleton (not in UniGene) with exon hit 0.139

333653 CH22_FGENES.239_2 0.139

306799 AI051696 EST singleton (not in UniGene) with exon hit 0.139

304872 AA595289 EST singleton (not in UniGene) with exon hit 0.139

330812 AA013001 Hs.60563 ESTs 0.139

329568 CH.10_p2 gi|3962490 0.139

319210 AA253074 Hs.146261 ESTs 0.139 

334320 CH22_FGENES.374_5 0.139

300860 AI916949 Hs.149748 ESTs; Weakly similar to weak similarity to collagens [C.elegans] 0.139

305866 AA864533 EST singleton (not in UniGene) with exon hit 0.139

312943 AA984364 Hs.119064 ESTs 0.139

330523 M99439 Hs.83958 transducin-like enhancer of split 4; homolog of Drosophila E(sp1) 0.139

312708 AI076204 Hs.135440 ESTs 0.139

309366 AW072970 EST singleton (not in UniGene) with exon hit 0.139

303273 AA316069 EST cluster (not in UniGene) with exon hit 0.139

317484 AW274696 Hs.143921 ESTs 0.139

333239 CH22_FGENES.111_1 0.139

307126 AI184951 EST singleton (not in UniGene) with exon hit 0.139 

316813 AA826505 Hs.124517 ESTs 0.139 

331746 AA281365 Hs.121640 ESTs; Weakly similar to KIAA0386 [H.sapiens] 0.139

308556 AI700145 Hs.172182 poly(A)-binding protein; cytoplasmic 1 0.139

310784 AW086142 Hs.159017 ESTs 0.139

323831 AA335715 Hs.200299 ESTs 0.139

307692 AI318342 EST singleton (not in UniGene) with exon hit 0.139

310570 AI318327 EST cluster (not in UniGene) 0.139

327934 CH.06_hsgi|5868184 0.139

305232 AA670052 Hs.195188 glyceraldehyde-3-phosphate dehydrogenase 0.139

334756 CH22.FGENES428J5 0.139

331938 AA451667 Hs.99255 ESTs 0.139

301393 AJ474722 Hs.150898 ESTs; Weakly similar to KIAA0644 protein [H.sapiens] 0.139
312005 T78450 
338431 

331214 T90496 
333601 

323481 AA278449 
336911 
338157 
327845 

319109 Z45662 

334763 

329384 

302996 AF054663 
323751 AW452656 
329916 

301993 N49828 

338129 

325704 



331673 W72366 
316807 AI018331 
310743 AW449754 



323855 AI653164
304705 AA564064 



333747 

318287 AW015616 
332972 

305704 AA825266 
315699 AW182605 



336400 

321033 H26214 

316522 AI475995 
335715 



333259 
337382 

322346 AA227618 



338500 



315279 AW51113B 

314439 AI539443 

333624 

329237 

330117 

338017 

337854 



305004 AA622328 
302815 N40373 
327823 
328753 

301201 AA904482 
334303 
326453 

311050 AI864581 
308740 AI802711
331003 H63959 
338010 
336326 

318100 R44308 
320641 R55421 
325855 
330425 HG1728-HT1734 



Hs.13941 ESTs 

CH22_EM:AC005500.GENSCAN.351-4
Hs.16757 ESTs 

CH22_FGENES.213J 
Hs.137429 ESTs 

CH22LFGENES.34W 

CH22JM:AC005500.GBISCAN.20M 

CH.05_hsgl|6531962 
Hs.90797 Homo sapiens clone 23620 mRNA sequence

CH22.FGENES.428J2 

CHJOisgi|5868869 

EST cluster (not in UniGene) with exon hit
Hs.209824 ESTs 

CH.16_p2gi|6223624 
Hs.18602 ESTs 

CH22LBfcAC0Q5500.GENSCAN.197* 

CH.l4_hsgij5867Q28 

CH22_FGENES.590_7 
Hs.40033 ESTs 

Hs.172444 ESTs; Highly similar to transcription regulator [M.musculus]
Hs.158665 ESTs 

CH.21Jisgi|6004446 
CH.07_hsgi|5868327 
Hs.128665 ESTs 

EST singleton (not in UniGene) with exon hit
CH.14_hsgij6469822 
CH22LFGENES.265JB 
Hs.143321 ESTs 

CH22J=GENES.51_5 
EST singleton (not in UniGene) with exon hit
Hs.189183 ESTs; Weakly similar to Nod1 [H.sapiens]
Oi01_hsgl|5867492 
CH22 FGBIES.823 15 
Hs.20733 ESTs; Weakly similar to !!!! ALU SUBFAMILY SX WARNING ENTRY
Hs.122910 ESTs 

CH22LFGBJES.599J5 
CH22_FGENES.650_2 
CH22_FGB€S.118J 
CH22_FGBJES.744-8 
Hs.10882 HMG-box containing protein 1 
CR12Jsgi|5866920 
CH22.EM^C005500.GENSCAN^90-1 
CH22„EWLAC005500.GENSCAN.362-5 
Hs.256581 ESTs
Hs.137447 ESTs 

CH22_FGENESJ222J 
CHJLhsgi|5868729 
CH.19_p2 gi|6015201 
CH223*AC005500.GBISCAN.134-1 
CH22JM:AC0Q5500.GENSCAN.38-12 
CH.16_p2gi|4646193 
Hs.162762 EST

EST cluster (not in UniGene) with exon hit
CH.05Jtsgi|5867868 
CR20Jsgi|5867B16 
Hs.197775 ESTs 

CH22_FGENES.373_6 
CH.19_hsgIJ5867399 
Hs.215477 ESTs

Hs.210337 EST; Weakly similar to aldolase A [H.sapiens]
Hs.142722 ESTs 

CH22_EMAC005500.GENSCAN.128-8 
CH22J=GENES.812_4 
Hs.242302 ESTs 

EST cluster (not in UniGene)
CH.16Jisg1|5867067 
324583 AM25411 HsJ-581 ESTs 0.142

326268 CH.17_hs gi|5867267 0.142

331390 AA460341 Hs.45008 ESTs 0.142

338904 CH22_DJ32I10.GENSCAN.10-16 0.143

5 333098 CH22J=GENES.79J 0.143 

331919 AA446869 Hs.119316 ESTs 0.143 

312214 AI248004 Hs.125187 ESTs 0.143 

323198 AW179174 Hs.7984 ESTs 0.143 

316107 AI204001 Hs.184014 ribosomal protein L31 0.143

301335 AA885317 Hs.190511 ESTs 0.143

337392 CH22_FGENES.747-3 0.143

325543 CH.12_hsgi|6682452 0.143

305903 AA073O85 EST singleton (not in UniGene) with exon hit 0.143

332707 L35594 Hs.174185 phosphodiesterase I/nucleotide pyrophosphatase 2 (autotaxin) 0.143

15 337913 C^EWtAC005500.GENSCAN^9-10 0.143 

301436 AA961061 Hs.131696 ESTs 0.143 

335078 CH22.FGENES.486J 0.143 

338451 CH22_EM:AC005500.GENSCAN.359-39 0.143

302777 AJ230640 EST cluster (not in UniGene) with exon hit 0.143

330464 J03068 Hs.78223 N-acylaminoacyl-peptide hydrolase 0.143

330988 H41411 Hs.33855 ESTs 0.143

328939 CH.08_hsgi|6004481 0.143

308015 AI440174 Hs^28907 EST; Weakly similar to GUANINE NUCLEOTIDE-BINDING PROTEIN BETA SUBUNIT-LIKE PROTEIN 12.3 [H.sapiens] 0.143

328504 CH.07_hsgi|5868471 0.143

332599 AA402891 Hs^2951 solute carrier family 29 (nucleoside transporters); member 2 0.143

335744 CH22.FGENES.601J5 0.143

322394 AF077208 EST cluster (not in UniGene) 0.143

323892 AL042661 EST cluster (not in UniGene) 0.143

318443 AI939323 Hs.157714 ESTs; Weakly similar to NEURONAL ACETYLCHOLINE RECEPTOR PROTEIN; ALPHA* CHAIN PRECURSOR [H.sapiens] 0.143

336568 CH22_FGENES.843_7 0.143

330958 H08815 Hs.159824 EST 0.143

327672 CH.04_hs gi|5867843 0.143

335900 CH22 FGENES.635JB 0.144 

336044 CH22_FGENES£79_6 0.144 

318845 AI815951 Hs.33183 ESTs; Weakly similar to estrogen-responsive finger protein; efp [H.sapiens] 0.144

333483 CH22_FGENES.165J 0.144 

333337 CH22_FGENES.139J 0.144 

305993 AA889197 EST singleton (not in UniGene) with exon hit 0.144

335719 CH22_FGENES.599_22 0.144

325682 CH.14_hsgi|6138923 0.144

327350 CH.01_hsgi|6249563 0.144 

339291 CH22_BA354I12.GENSCAN.18-1 0.144 

326358 CaiBJisgi|5B67293 0.144 

330316 CH.08_p2 gi|6007576 0.144 

50 308150 AI499346 Hs.174131 ribosomal protein L6 0.144 

338065 CH22_EM:AC005500.GENSCAN.164-1 0.144

339009 CH22_DA59H18.GENSCAN.18-7 0.144

327776 CH.05_hsgi|5867964 0.145

336664 CH22_FGENES.41-8 0.145

321921 AF070619 EST cluster (not in UniGene) 0.145

319346 T70147 Hs.12024 ESTs 0.145 

304265 AA062892 EST singleton (not in UniGene) with exon hit 0.145 

303818 Z45986 Hs£50178 copinell 0.145 

327498 CH.<£_hsgi|6017G23 0.145 

60 335227 CH22_FGENES.513_13 0.145 

339022 CH2*_DA59H18.GENSCAN_2-1 0.145 

302597 H55661 Hs33Q26 ESTs; WeaWysirniiar to sim^ to Enters 

TRAB[OeIegans] 0.145 

308550 AI697008 HSU01811 EST 0.145

302175 AA262760 Hs.156015 Homo sapiens chromosome 19; cosmid R29381 0.145

303252 AA156760 EST cluster (not in UniGene) with exon hit 0.145

337414 CH22_FGENES.757-2 0.145

310362 AI734009 EST cluster (not in UniGene) 0.145

329333 CHJOtsgP68806 0.145 
336857 Cr^JGENES^I^ 0.145

332565 AA234896 Hs.25272 E1A binding protein p300 0.145

318634 AI928098 Hs.156832 ESTs 0.145

336318 CH2JLFGENES.801J 0.145

310980 AI923551 Hs.170843 ESTs 0.145

335346 CH22_FGENES.537_2 0.145 

331196 T65416 Hs.12826 ESTs 0.145 

337607 CH22_C20H12.GENSCAN.17-3 0.146 

331206 T84096 Hs.15284 ESTs 0.146 

301793 T80698 EST cluster (not in UniGene) with exon hit 0.146

319590 AA210878 EST cluster (not in UniGene) 0.146

311394 AI695374 Hs.256231 ESTs 0.146

324773 AA632554 Hs.163401 ESTs 0.146

324841 AI142359 Hs.155316 ESTs 0.146

332260 N70088 Hs.138467 ESTs 0.146

329276 CH.X_hs gi|5868762 0.146

335887 CH22_FGENES£33_1 0.146 

338294 CH22L.EM^CXXB500.GENSCAN297-1 0.146 

336993 CH22J=GENES.4Q94 0.146 

334135 CH22_FGENES.336_2 0.146

326251 CH.17_hsgi|5867263 0.146

337398 CH22_FGENES.749-1 0.146

339167 CH22_DA59H18.GENSCAN.69-8 0.146

316838 AW135418 Hs.161210 ESTs 0.146 

325313 CH.11_hs gi|5866865 0.146

331047 N66918 Hs.32205 ESTs 0.146

323915 AL043362 EST cluster (not in UniGene) 0.146

302747 AF062275 EST cluster (not in UniGene) with exon hit 0.146

306317 AA947909 EST singleton (not in UniGene) with exon hit 0.146 

334399 CH22_FGENES.382_5 0.146

326472 CH.19_hsgi|5857404 0.146

333061 CH2JLFGENES.75J 0.146

337072 CH22_FGENES.448-5 0.146

334328 CH22_FGENES.375_5 0.146

327039 CH.21_hsgi|6531965 0.146

325576 CH.12_hsgi|6552443 0.147

315935 AI075804 Hs.132660 ESTs 0.147

319638 AA323758 EST cluster (not in UniGene) 0.147

334501 CH22_FGENES.397_17 0.147 

40 338238 CH22_EM:AC005500.GENSCAN.2644 0.147 

308636 AI744063 EST singleton (not in UniGene) with exon hit 0.147 

336567 CH22_FGENES.843_6 0.147 

335819 CH22_FGENES*19_2 0.147 

336950 CH22_FGENES.361-8 0.147

307055 AI148477 EST singleton (not in UniGene) with exon hit 0.147

315134 AW504854 Hs.126714 ESTs 0.147

335834 CH22_FGENES321J 0.147

327870 CH.06_hsgi|5868131 0.147

£3802 AA332011 Hs.250138 protein phosphatase 2C; magnesium-dependent; catalytic subunit 0.147

329412 CHJLhsgi|6682553 0.147

323791 AA333068 EST cluster (not in UniGene) 0.147

324126 AA385315 EST cluster (not in UniGene) 0.147

327865 CH.06_hsgi|5868130 0.147

333445 CH22_FGENES.154_2 0.147

321302 AA021351 Hs.158497 KIAA0724 gene product 0.147

336744 CH22_FGENES.118-9 0.147

323731 AA323414 EST cluster (not in UniGene) 0.148

320289 H07989 EST cluster (not in UniGene) 0.148

305488 AA749000 EST singleton (not in UniGene) with exon hit 0.148 

60 305592 AA780594 Hs.62954 ferritin; heavy polypeptide 1 0.148 

304094 H11295 EST singleton (not in UniGene) with exon hit 0.148 

325040 AW296368 EST cluster (not in UniGene) 0.148

339034 CH2SLDA59H18.GENSCAN.2e-2 0.148

334504 CH22_FGENES.398_2 0.148

334778 CH22.FGENES.431J 0.148

320148 U77494 Hs.119687 RAN binding protein 8 0.148

303584 AW173759 Hs.203401 ESTs 0.148

325826 CH.15_hsgi|5867048 0.148

331192 T55182 Hs.152571 ESTs; Highly similar to IGF-II mRNA-binding protein 2 [H.sapiens] 0.148

325785 CH.14_hs gi|6381957 0.148

333166 CH22_FGENES.91JJ 0.148 

336548 CH22_FGENES.841_5 0.148

337552 CH2a.C4Q1.GENSCAN.1-4 0.148

331775 AA382742 Hs.97151 EST 0.148

338938 CH22_DJ32I10.GENSCAN.19-6 0.148

331869 AA428554 Hs.104894 ESTs; Weakly similar to fibronectin precursor [H.sapiens] 0.148

332865 CH22_FGENES.28_5 0.148

328663 CH.07_hsgi|6004473 0.148

328436 CH.07_hsgi|5888417 0.148

311158 AI634864 Hs^50789 ESTs; Highly similar to similar to NEDD-4 [H.sapiens] 0.148

336942 CH22.FGENES.354-2 0.148 

302262 R53169 Hs.246091 ESTs 0.149

333296 CH22_FGENES.132_3 0.149

333365 CH22_FGENES.142_2 0.149

311706 AW452392 Hs.252854 ESTs 0.149

337109 CH22_FGENES.489-2 0.149

315062 AW173300 Hs.190201 ESTs 0.149

333454 CH22_FGENES.157_3 0.149

334784 CH22_FGENES.432_9 0.149

333255 CH22JH3ENES.118J 0.149

337518 CH22_FGENES.814-7 0.149

320551 AA489268 EST cluster (not in UniGene) 0.149

323437 AA287567 EST cluster (not in UniGene) 0.149

328761 CH.07_hsgi|5868302 0.149

328787 CH.07_hsgi|5868309 0.149

335261 CH22_FGENES.520_2 0.149 

300827 R16689 Hs.106004 ESTs 0.149 

339263 CH22_BA354I12.GENSCAN.10-1 0.149

337412 CH22_FGENES.756-6 0.149

334414 CH22 FGENES.384J 0.149

332931 CH22_FGENES.38_5 0.149

310801 AW270980 Hs.106346 novel centrosomal protein RanBPM 0.149

305216 AA669056 EST singleton (not in UniGene) with exon hit 0.149 

35 314779 AA470122 Hs.190261 ESTs 0.149 

338414 CH22_EM^C005500.GENSCAN^41-27 0.149 

303342 AW247361 EST cluster (not in UniGene) with exon hit 0.149 

337509 CH22_FGENES.8064 0.149 

306631 AI001149 EST singleton (not in UniGene) with exon hit 0.149 

302533 L36149 Hs.248116 chemokine (C motif) XC receptor 1 0.149

336536 CH22.FGENES.839J8 0.149

324666 T32458 Hs.14285 ESTs 0.149

310173 AI767433 Hs.170013 ESTs 0.149 

333595 CH22_FGENES211J 0.149 

335975 CH22J=GENES.652J 0.15

306654 AI003654 EST singleton (not in UniGene) with exon hit 0.15

335025 CH22_FGENES.475_3 0.15

328711 CH.07_hs gi|5868271 0.15

328274 CH.07_hsgi|5868219 0.15

325505 CH.12_hsgi|6682451 0.15

329641 CH.14_p2 gi|6468233 0.15

304955 AA613504 EST singleton (not in UniGene) with exon hit 0.15

339103 CH22_DA59H18.GENSCAN.44-10 0.15

329636 CH.12Lp2gip302817 0.15 

55 310118 AI203293 Hs.157489 ESTs 0.15 

326056 CH.17_hsgi|5867184 0.15 

303773 AA769074 EST cluster (not in UniGene) with exon hit 0.15 

303153 U09759 Hs.8325 mitogen-activated protein kinase 9 0.15
TABLE 13A shows the accession numbers for those primekeys lacking UniGene IDs for
Table 13. For each probeset, we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from
GenBank ESTs and mRNAs. These sequences were clustered based on sequence similarity
using Clustering and Alignment Tools (DoubleTwist, Oakland, California). The GenBank
accession numbers for sequences comprising each cluster are listed in the "Accession"
column.
Pkey: Unique Eos probeset identifier number

CAT number: Gene cluster number

Accession: GenBank accession numbers



Pkey CAT number Accession 



322050 24275 J 
321439 1599424.1 
321666 13653.22






300088 622937 J 
322303 704603J 
322394 27492.1 



321758 44275.1 
323109 155498.1 



322533 38937.1
321921 34680.1 
321927 21620.1 



321932 265316.1 
306971 14694.7 



AL1 37589 AA423949 BE222949 BE222694 AI199615 AW8731 16 AI277950 AW044290 AW630096 
H61962 W01567 N75711 

BE259906 AA232518 AA013359 AL035788 AW160822 BE387134 BE002954 BE391839 AW161565 AIB78841 BE616458 
BE409981 BE387308 BE297438 BE315536 AA206924 R12012 AA214169 BE312812 BE387093 H11710 BE312009 
BE26Q569 AA343566 AA219526 R34757 AA219749 BE336733 AA219751 AW411099 AA232408 BE018716 BE3980B9 
AA206253 AA053487 AA1 14224 AV655868 AW732566 BE394087 AW732574 AA313442 BE336875 AA070548 BE259840 
BE019828 AW732341 AA299916 BE019253 BE018238 BE387109 AA232304 BE255589 AW732585 AA181436 AA308777 
AA075802 AW732521 AA314526 AA226747 BE409513 AA206168 BE388292 BE298782 BE387086 AA305310 AV652723 
AA314918 BE615510 AW951763 BE398104 BE385195 BE407165 BE391338 BE390187 BE389189 BE540650 BE249884 
BE385985 BE274245 BE391 124 BE260080 AA182600 BE512821 BE39O090 BE279398 BE279589 BE263454 BE515194 
BE293569 BE272531 BE388814 BE384659 BE271685 BE561043 BE278449 BE302572 AW239076 AI750583 AA376179 
AA1 12632 BE266324 BE266614 R 13105 AA132286 BE296305 A1220355 AA205606 AA219527 AA219519 AW8O4310 
AA083286 BE171208 T19693 AA33832B BE185868 AA903024 T92162 AA330119 BE410404 BE314668 
AW576245 BE207878 AW299993 AI199558 AI285442 AW299994 AW394242 AW394184 
AI35741 2 AI870708 A1590539 W07459 

AW068287 AA310079 BE335702 AA35631 8 AA306059 AA346785 AW402633 AA31 1210 AW402909 N76879 AW4Q2913 
AW401920 AA321636 AA354474 C17297 C16938 AA31 1774 M29871 NM_002872 ZB2188 AW405674 H94176 R89281 
AA2H723 AI014482 AW949347 T27749 AWB04226 AW796984 AW404581 AF077208 NM.014Q29 W68830 W79652 
AA353375 AW575218 AA552192 AA521232 AA702695 AA033975 AW407827 AA829948 N944Q2 AW628604 AB23308 
N57605 AA641662 H42477 N52784 AI753478 AA768493 AA845729 W47391 N55270 A1090117 R89282 BE206172 
AA076S50 AA595650 AI218931 BE049397 A14331 1 0 W741H H94277 AI358627 A1085221 A1662818 AA835967 AW103905 
AI640644 AA835507 AA856887 AA694392 AW337542 A1524410 BB045500 AI440060 AI358801 AW028238 AW205248 
AI718264 R48618 AA3573S8 A1695002 AA897549 AW081065 AI433360 AI310783 A! 620963 Z82188 AA360224 
U291 12 AI656540 AI364875 AI656246 A199094O 

AA169345 AI762857 AI949997 A1B09601 AJ681848 AI221079 AW167404 A1347614 AI611090 AI023472 AQ47683 A1027467 
AW591788 AI380565 AA835735 AA836654 A1244028 AW193159 AI500112 AI918722 AI738693 AT702308 AA805365 
AI766842 

T59538 T59589 T59598 T59542 AF147374 
AP070619R203027B0358 

AJ223366 BE305086 AW820106 AA621983 BE305208 AI738475 AE380189 AW590847 Al 127232 AA622706 AI380858 
AA621 975 AI587036 AA665743 AW204003 AI692234 AIQ02242.AI692219 AW137282 AW268783 AW295910 A1308015 
AW301462 AI318288 Ai 31 8575 AI3181 17 AI345591 A1249650AI246934 AI246864 A1246971 AW268311 AI249654 BE041907 
AW732776 

N72324 N52825 W19526 BE143464 AA376060 

M83667 NM.005195 S63168 MB3667 AW068039 AW630649 AI338577 AI018125 AI269878 AW242440 A1887823 A1342581 
BE222416 AI582847 AI651011 AI660815 A1699574 BE550201 A1926996 AW665855 A1827752 AI761857 BE328168 
BE222451 AI762201 AW000929 AW007207 BE042962 BE551843 BE465373 A1279179 AI949945 BE551862 AW051667 
BE328076 BE222296 AW007229 AW772332 AI279801 AI934526 A1631938 A17701 03 BE041412 A1417900 A1692655 
AI869943 AW2701 19 AI431739 AI703347 AW770568 AWQ25473 AI701497 AI128026 BE328147 AW203980 BE046793 
AW087704 AI674597 A1650732 AI813691 A1472092 AI695224 A1241217 AW207746 AI206840 A1271362 AI631788 A19118B3 
AI914619 A(380585 AI767501 AI823759 AI564116 AI190991 AI377369 A1814122 AI221623 AI354793AI081988 AI391740 
AI337435 BE467386 AI824347 AI565325 A1280038 AI640455 A1819744 BE467803 BE327524 A1149402 AI313187 BE219684 
AW611948 AW665821 A1091260 AW044492 BE220366 AW025381 AW1 83264 AI694865 AI498474 A1129780 AI202028 
AI566792 BE220659 AI928040 A1830696 A1493021 AW612488 AI913152 BE042965 AI631837 A1693873 AI498925 A1768668 
AI401544 BE327023 A1693383 AI769874 AI744003 AW082273 AI686501 AI798177 AI985196 AI090033 AI432342 AI 6899 18 
AI638308 BE468080BE219588 AI912119 BE219787 AW005392 BE326564 AI589039 AI860187 AI758143 AI338168 
AI7Q2936 BE221985 AM98727 AI918196 AI279735 AW771497 A1B60133 AW237834 AW661759 AW0281 1 1 BE503416 
A13801 80 AW61 1715 AI871777 BE045447 BE326444 Ai266547 AI800237 AI823315 A1478368 A1264281 AI675841 A1690041 
301119



324019 262782J 
323437 189513.1 
307845 19804.10 
324126 272259.1 
309101 7570J 



315703 119175J



301373 368214.1 

323665 54093.1 

323676 220254.1 

302086 23306.1 

323731 226183.1 

323791 232336.1 

325040 23854.1 

324430 312113J 

323892 477253.1 

309488 1030131.1 

302251 27216.4 

302286 22717.6 

323915 110063 J 

324594 330528.1 

301737 65.1 



AI49801B AI554124 AI239893 AI864054 A1280099 AI192815 A1620465 AJ08Q201 AW002057 BE500986 AI341 131 AI818991 
AI566137 AI123403 BE2191 92 AW183844 A1499842 AW137971 AW138720 AW015526 AW138160 AW2431 63 AW138705 
AW139927 AW14O006 AW138810 AW137450 AW206970 AW135419 AW205974 AA043494 BE465106 AW139955 AI741112 
BE326942 AA043506 AI078957 AI942432 AI392902 AI097O47 A1470599 AA514553 AA984008 N47949 AI6541 14 AA884832 
AI796752 AI765290 A1301 155 AW470358 BE222764 AI823569 A1651 188 AI692695 AI476643 BE504307 AI767573 BE219719 
AI932249 AW467075 A1913633 BE221966 AI091025 AA969215 A1799810 AA931 170 BE048559 AI809606 AI138614 
AI739456 AI674605 AW772068 A1089286 AI6257B7 AI263418 AW008638 AI928389AW628997 AW70010 AI914168 AI760003 
AI203050 A1334069 AI694788 BE045337 A) 948559 A1912982 AI867131 A1192102 AI767583 A1347518 AI566005 AI625684 
A1215888 AI633904 AW 82265 AW614357 AI128030 A1343685 A1914283 AI985003 AI823576 A1493053 AI380285 Atfi33895 • 
AI267880 AI5381 62 AI991 552 BE219479 BE219296 AI3Q2178 AW779296 AI913805 AI631644 AI566772 A1985498 A1942289 
AI935659 A1339092 AI247432 AI686472 AI766886 AI017228 AI333272 AW301668 AI972218 AW082027 AI632974 AI474761 
AI766127 AW236578 AW000986 A1870734 AI222399 AI871249 A1703448 BE464210 AI768037 AI871585 AI767871 AI738757 
AI220732 AI681633 A1768783 AI684463 AJ307339 AI263203 AW665264 BE463969 AI768786 A1439118 AI127913 BE218324 
AI672342 BE220052 AI786163 A1221662 AW197672 AW025300 AI769681 AW612448 BE219757 AW072420 AI669980 
AI830418 AW204353 AA04701 1 AA913868 AI7391 46 AI669954 AW470507 AW614835 AW3Q2151 AW772372 AI762427 
AW3399Q2 AW3O3370 BE464775 AW299818 AW236072 AW1 95060 AW274737 AW263062 AW183846 AI868894 AW300493 
AW172509 AW51 6876 AW593773 AW299474 AW303546 A1817323 A1823624 AJ694005 AI934589 AJ343479 AI861825 
AI962726 Ai765845 AWD80318 AI640227 AI763042 AI768903 AW235386 AA738489 AW341293 AA588585 BE221732 
AI914179 AW611669 AI572789 AW194735 AW236122 AW236007 AW612789 AW197501 AW185046 AI797145 A1864423 
AI45B934 A1342848 AI693227 AI912642 AI689993 AA932572 AA74Q269 AW470392 AW086020 AI221701 T69326 T70461 
A1765579 A1338263 AI431721 AI394249 Al 186462 AIB23571 AI953665 AI497954 AI761057 AI678228 AI640302 AB48742 
AA594626 AA883155 A1972682 AI804774 AI300407 A1433524 AA897341 A1401 175 AI291071 AA021213 AI126509 AJ948955 
AI218835 AA903938 AA502610 AI498320 AA584267 AA935265 A1478253 AA489S58 AA975053 AA715326 AA557139 
AA126417 AA971455 AA557319 AI499738 AA911438 AI913637 AA494506 N90793 AB90724 AA131667 AA128164 
AA046840 A1262557 M131729 AA594926 T59467 AA436907 AA044630 A1589177 AI278237 A188049B A1431822 AA708934 
AW612558 AI634069 W03610 AM 92272 BE550862 AI400879 AA7085O7 Ai 128003 A1375308 AI271423 Al 199552 AA125977 
AI366498 AA458662 A1694382 AA044627 AI636263 AI786270 T80146 AW014724 Al 87081 2 AI948781 AA369865 AKJ94721 
AV\fi71817 AI262898 A1244680 T69252 AI934148 AA046357 W19109 AA028157 AW021924 AA253491 AI189397 AI934388 
D58282 W21323 W24288 AI682972 AA293683 AA284566 AV65951 1 AA434184 H87089 AA040038 N57464 AA343709 
AW805815 R89837 

BE621 320 BE266806 BE276582 AW516729 AF 142578 AW451687 AK000069 AA325236 BE168997 W73105 AA715365 
BE278873 AA808894 AA3B6371 AW517942 AW750993 BE140314 BE392384 BE621757 AA318192 BE548173 AW152607 
AW166898 AA352215 AW841506 T59802 AF147378 AA335719 AW956069 T59668 AA826362 AI981329 A1290469 
AW197375 A1805651 AA160748 AA581089 A1968889 AA581100 AA501478 AI621069 AA468534 AA503715 AA658457 
AI144504 BE387827 AA159880 
AW177009A1381610 

AA287567 AA252404 AW967735 AA2B7568 AA761222 AA865644 AA831245 
BE514807 R43224 AI363450 AA45Q226 AF030942 
AA385315 A1627453 AJ050695 AI348281 

AB40462 AI5B3268 AA079086 AI950777 AI301866 AI925108 AW876954 AW877000 AA525418 AA888549 AI934220 
AW380220 AA804858 AI927576 T61151 AW384053 BE391691 AA533856 AA248400T48202 N57156 R68346 R26020 
AL050332 W30806 H61 369 AA092592 AA230324 BE27121 7 AW372903 T48772 AA3580Q2 AA0943Q2 AA559856 AW373308 
AW373315 AW373297 AW373311 AW373314 AW373309 AW877055 AW770140 AW379805 A1581 609 AW364144 AA078921 
AA715432 AA654210 AI004899 AA602209 W47464 AA506588 R26822 AU076528 A1535743 A1535704 AI535681 
AA402307 060405 061237 D59S91 AW964877 AA325215 AI459739 N3607O N25658 AA083684 AW293368 AI761958 
AI741205 AI693175 AW873603 All 43269 All 87124 N25199 H1 9323 AI650842 AW316825 AA083842 AA935650 AW298404 
AI472001 AI648568 R17676 R41625 Al 123237 R 17677 A1206866 F36920 AI654713 F34084 AA618029 AI915139 AW275194 
AW514577 D80420 AW149850 Z40953 AI867B81 AA927547 AA974344 AIB25793 AI635565 A1652157 BE504748 AW295759 
F16600 AW839796 F01781 AA909730 AA984010 
AA595235 AW973839 T03040 
AW248307 AA313452 AW951927 AA355961 BE566080 

AI702835 AI758919 AI685405 AI952108 A1299207 A1400767 AW1 05389 A1952710 AA845312 AJ7841 18 AI537315 
X57138 NM.003514 Z98744 BE253911 BE256314 AI095013 Af 138475 
AA323414 AW664013 AI8Q9377 AI276041 AW296883 AI798340 
AA333068 AA331863 AA331838 AW962531 AA331442 

AW296368 AA247632 AK002030 R15304 T08775 AW975664 A) 186801 AA730668 AW190918 AI141 176 AW513211 

AI275071 AA988601 BE042933BE045713AW087176 

AA464018 AA464079 AM68142 

AA846318W15478AL042661 

AW131104BE246610 

AA333340AW055834 R49755 U33428 . 

R58438AA358612 

AL043362 AA350031 AW751972 BE5491 18 
AA497090 AI351679 AI350914 

AI815981 AF287269BE260960BE263991 AA311733 F12145 F07345Z43604 T29948 H641 02 243611 T35364 N40667 
AI909763 AW751045 AA160594 AI816064 AI307240 AI951554 AA641031 AA293045 AI942492 A1687077 R78689 H12368 
AA894728 Al 124930 AI423498 AA777759 AA614585 AW071 822 T66288 A141 8558 H21480 AI33501 1 AI051728 AA293436 
AW302233 AW188628 N26393 AI076557 A1311022 AW451505 H62593 Z39666 H12315 AI761351 AI364142 F02935 
AW571491 T35366 AB40745 H64151 AA503793 AA831948 AI627686 A1761531 F03591 F09782 
301763 1688575J
301780 18597.2 






301793 239325 J 

303049 102592J 

301883 19477J 

301872 27494.4 

301893 6561 J 



310382 653318.1 
303181 74060J 
302569 17513J 
324893 4870 J






303244 9334J 






303252 149690.3 
303273 67758.1 
302640 21194.1




303342 189722 J 

302703 7075.1 




318446 604736.1
302815 42200.1 



302879 36555.1
318540 1018709.2 
302928 22118.1 



R01279 R05898 TB6522 

R05735BE349800 R37388N79751 R10115 AA702039 AA836147 AA505716 AI049661 AI499239 R54072 AI023394 
AA827710 W60285 W500038 AI884786 AA827191 AA810075 AW005088 R70248 AI858560 AW078678 AA631306 H52839 
AW085835 AI656182 AA737178 AW136923 AA281028 AA570316 AA722871 AA362737 AI217268 BE242373 R011 13 
AA628946 AI394527 AW402308 A1361 1 10 A1917585 T99639 AAB05326 N44577 Al 394021 AW403385 T23949 A1497766 
T966Q2 AAB34947 AI693908 Z33450 T92127 BB41896 AI933301 BE251540 BE252269 N50968 AI695531 AW575523 
AW296889 N93796 N89924 AB61804 AI085251 AA810694 BE30301 1 AA743784 R13478 AA358771 AA325294 AW964880 
6E258953 R541 16 AW881039 AW602593 
BE265837 AA340632 T96304 T96075 T72780 H51978 R09868 

AW408042 AW407562 BE172835 BE396B93 BE269184 AA045741 BE004187 AW751261 W74283 

BE263301 A!418863NMJ»5194 X52560 AW328683BE298869 D63161 

H84730T73262 

T80334 BE292758 AK000854 H16998 BE253691 R88508 AA357663 AW955288 AW579550 N98864 AA595201 AI742967 
AA602658 AI091433 AA813367 AI983217 AW298007 AI628490 AI708037 AI560654 AI032883 R88509 R38972 AI687783 
A1560153 AW874581 N69891 AA993617 H51180 A1269042 AI281358 AW591213 A1017724 A1262859H16997 R38991 
A1804355 AI868988 AI669525 AW023081 AL047848 
AT734009 A1263076 AW272255 AI792912 

AA452366 AA351338 BE262590 BE262591 AA074050 AA389867 BE161346 
AC004472 BE312721 BE273942 F11928 T65358 BE612432 BE261576 BE179884 

AA324119 AW246199 BE395368 BE261676 BE382334 BE394701 BE304548 T31840 BE398128 BE398019 BE296693 
BE379564 BE269460 BE397065 Z42029 BE305028 AK000549 BE536182 BE314372 AW393349 T50987 AA069735 
BE386997 AW381699 T51050 W95025 AA477678 AA348306 AW956831 AW06281g AL040397 BE305160 AA315419 
AW249929 AA295944 A1635946 AI870259 AI951 125 AW028250 A1885184 AW873113 A1077544 AW025091 AI817594 
AI401718 AWDQ8245 AI499064 AA599687 A1016890 AA765638 W93340 AA588708 AW519173 R51917 AA676778 AI084871 
AA687684 AI860840 AI81 1 921 AW514730AA477561 N78845 AA779894 AA778559 AI968953 T16188 T32828 AA991426 
AI474472 AW73542 AI828972 AW247906 AA977415 AW591489 AA876008 AW191893 AA074278 AWB74099 Z40196 
AW083615 F01544 T55984 AI29041 3 AI972167 AI365049 T36028 AI042568 BE560076 W171 19 AA196376 T47999 R54309 
AK001269 AL354613 AA147472 AA490B03 BE207628 AW816113 AA085574 AW503392 AA299910 AW750305 BE079539 
BE079484 BE512B38 AK001593 AW968772 AW967440 AW206280 AA251270 AI627B86 AA303599 AA147473 BE206616 
AA490611 AA715039 AW590866 AW590447 Al 86451 2 AA204731 AA894490 BE001136 AA612785 AA237035 AA149960 
Z44257 R12986 AA448446 AI734041 AA422167 BE220551 R66041 R32927 R32942 AA258773 AW388142 R53730 N54624 
AW880296 AA253485 AW954441 H98989 AW614348 AI654838 AA779793 AW237213 N66635 A1186812 AA947479 
BE158011 AI859480AW805579 N52010AA806305 AI628445 AW270990AA778165 AA149949 A1650728 AA749108 
AA687257 AI261661 AA747442 AA481351 AA206339 AA9O3407 AW473306 AI688930 AA262261 AA448310 AA748820 
A1347430 BE465692 R32839 AW510564 AA436408 AA257971 AA253362 AA938330 AA51 3150 AA976840 AA6871 17 
A1281547 AA046243 R32825 AI631554 AW139818 AI244536 R52946 AW235443 R40183 AA299909 AA811958 AI302918 
240213 BE158047 BE158060 AA767245 AW748159 AW500735 AA094074 
AW393348 AW393350 AW386713 AW384705 

AA316069 BE274224 AL120803 BE170052 BE170039 AI906340 BE091310 AA491506 AW838S75 AW863111 
AW973784 AWB43642 AA557573 AA578088 AI125161 AA349349 AI372794 BE312586 BE312777 T32148 AW239077 
AI905357 Z42685 AW298772 R18578 AA780425 AA325971 AI37Z7B3 R10658 AA295Q21 AW885349 AW885288 BE271987 
AW368519 AA349350 AA233207 R88464 AA434299 RQ2058 R00019 R54563 Z44886 R2O150 AW368328 AW368321 
AW802152 W79803 H12809 AA028951 AW3B7382 AA295247 H46355 AA905620 R54564 H12765 AW950608 AA028952 
AA366908 AI085652 R43207 R77854 AI672848 T28547 AA427734 AA572853 AA769934 AI242108 R00020 R02059 R10659 
AI185270 AI041 890 NM.O0OO8O X68403 F03854 AI652442 AI766431 AA976913 AI989882 AA471024 AI802727 AI8241 12 
F02169 A1890843 BE250876 BE252859 AL15741 8 R78326 
AWD68570 AW247361 AA252638 AI751982 BE260758 BE293073 AW293303 

AB040951 AK0Q2094 AA676593 W44644 N42376 Z45942 AWB41844 BE541378 AA358274 AA213391 T88771 NM_015493 

AL1 17489 N88248 N31714 N36273 N31721 AW576263 AA449380 A1366135 AA551576 AW149789 217418 AW474331 

AA058181 AI75361 1 AA046428 AA488007 AA300764 N44732 AA377697 AA346752 AA485787 AA894546 AA115295 

AA299914 R88096 AA367342 AW884666 W84522 AA426325 AI983849 AA873315 AA873307 AI355170 AA53467B 

AA969227 AI127202 AW083323 AI338244 AW020877 AA780019 N33426 AW069314 N63079 AI926527 AA1 15270 

AW886601 AI357402 AA599312 AI480356 AI926969 AA429402 N33197 AW886733 R88205 N52803 AWQ21988 AA213392 

BE139656 AI142383 AA427844 AA954743 AA233622 AW073382 AA426326 AA493560 AA425133 N24819 AW19516 - 

AI571515 AI147373 A1628677 AI214877 AA992123 H71599 AA029095 AA622262 AW117398 AW275286 AI911337 

AAB64950 T94173 AI475634 AI70141 1 AI287696 T94091 AA505746 A1184310 AI350967 AI083596 W74274 AI954381 

AI832767 A1368443 AA1 95578 AW87441 6 A1005421 AW014339AA908660 A1350791 AW241382 A1473104A1275186 

AA515528 AA1 94897 AA782901 AW069414 F20248AA426011 AI305169 A1832109 AI570082 AW072984 A1492474 

AA919076 AL049024 W79889 N42400 AA625435 AW963887 AA233420 

AW779971 AW300287 AW152002 AW069505 AI866447 AI298231 AI146920 AT692267 AI872876 

BE397032 AJ292529 N40373 N34073 AA321112 AW959902 AA258103 AW860213 BE549059 BE295027 BE296657 

AA300789 AI971491 AW513665 AA909530 AI951045 AW058103 AI971506 AI061239 AA600054 AI000807 AA9B9975 

AA281492 AW593654 AA321 1 1 1 AW298633 AI278754 AI863862 A12B5506 AA989727 F331 14 T16079 AI762625 AW921 03 

AW770346 AW026768 AW68710 AI499987 AA310412 AA622784 AA642297 AI866427 

H11802 T66097ARW2831 

R42185 AW939055 730280 Z43366 R54166 

AA938905 AA574056 AA714466 AI805592 A1123431 AA229723 AA620759 AI004450 AW299820 AI949299 AW874308 
AA626037 A9741 12 AA931563 AF073924 AA995769 AI766441 A1367730 Al 081 342 AA235800 AA235801 AI138970 
303701 1155179J
303759 447287J 
303773 356532 J 
303778 174437J 
303784 414659.1 
303845 50211J2 
162688J 



20121 452027J 
319590 171338.1 
305186 17456.1 



319638 226485.1 
320257 163534 J 
320289 115941J 
304703 33971.42 



AA719797 AA759343 X89673 AA759344 AA312909 X87825 Y10529 AC006271 M758739 BE501015 AA909905 AF065857 
AC006271 AA970044 

AA158883 AA171835 AA187049 AA143546 BE299538 BE614280 BE621705 BE299684 BE619550 BE613099 BE619558 
BE514331 BE617716 BE612920 BE615742 BE25B739 BE621539 AI43451 1 BE546696 BE614324 BE379359 BE250106 
BE250681 BE299592 BE300272 BE616805 BE397385 BE562024 BE271246 BE250556 BE280311 BE561995 BE618755 
BE276126 BE546275 BE311547 BE262155 BE281082 BE513087 BE546891 BE514289 BE397389 BE267442 BE545455 
' BE614483 BE293447 BE270710 BE2B1071 BE267458 BE542Q95 BE262701 BE513634 BE548116 BE299546 BE619604 
BE512885BE616638BE266173BE258933BE259710BE268569BE563861 BE614871 BE537509 BE250108 BE515323 
BE538868 BE25O081 BE277706 BE410127 BE619445 BE250753 BE304969 BE616348 BE546878 BE544962 BE410346 
BE267256X17206NWL002952BE304541 BB819171 BE259655 BE549186 BE314944 BE613101 BE378069 BE621110 
BE542752 BE257029 BE531315 BE619306 BE267328 BE259439 BE297093 BE280651 BE407684 BE250201 BE312819 
BE535432 BE279917 BE312626 BE5311 18 BE378744 BE275370 BE250195 BE409980 BE274432 BE266637 BE279321 
BE622382 BE28Q232 BE26S816 BE378977 BE300145 BE250204 BE547609 BE264377 BE266688 BE259746 BE260829 
BE619517 BE388097 BE264025 BE618945 BE614758 BE312249 BE294359 BE531121 BE622300 BE615109 BE544354 
BE61499B BE393239 BE297520 BE393221 BE278818 BE279309 BE265476 BE618772 BE61S185 BE265144 BE249837 
BE312230 BE407843 BE253884 BE407645 BE615804 BE619058 BE559512 BE383249 BE613497 BE294351 BE295062 
BE622385 BE390654 BE535438 BE563186 BE396374 BE270842 BE386110 BE260368 BE250186 BE265875 BE537229 
BE253369 BE256997 BE269482 BE264959 BE279072 AA662160 BE280733 AA858428 BE561308 BE267285 BE561422 
BE563181 BE304614 BE295437 BE619424 BE275863 BE394315 BE408109 BE541866 BE253772 BE618236 BE535261 
BE296490 BEZ78212 BE563154 BE257245 BE262274 BE513032 BE378567 BE394152 BE618947 BE269302 BB46516 
BE536792 BE615187 BE261 186 BE615367 BE619289 BE261184 T49376 AL031671 BE273400 BE563457 BB45597 
BE515169 AA150323 AA158723 AA079033 BE313333 AA160100 BE271115 BE294302 BE273051 BE273048 BE622390 
AA837947 BE387721 AW973Z77 AA808731 BE280792 AA160444 BE256723 AT745520 AA643017 BE549441 BE293858 
AW975249 AI620819 AW089494 AI434549 BE305231 AA081262 BE2B0101 AA522507 AI95088D AA187460 BE386860 
AWB59229 BE170489 BE620149 BE548218 AA316696 AA484426 AI567740 AA1 60605 AW939805 AA089573 BE30O194 
BE391331 AW975419 H26808 BE545544 BE615974 AW800241 BE616222 W17343 BE387865 T53697 C03943 BE617637 
BE315130 T52942 T50588 N74693 AA187107 T59919 AW797397 AA206447 AA854619 T57175 A1570296 AW517964 
AA158269 A1282220 W25297 AI580710 BE262453 A! 185868 AA526485 A1288051 AJ582513 AA1 00675 AW615567 
BE395354 AI472725 BE314881 BE621281 N99921 A1282689 A1432725 AW73201 1 AA872254 BE205807 T59435 AI282712 
AA650505 AI004374 AA725260 BE313161 T60173 AI371260 BE385641 AW751812 AA078827 AI491858 AW33622 
AA2191 18 A1002092 AA996003 AA064604 AI250287 A1304397 AI453213 AA653630 AI524573 AI440306 H48802 AA157843 
AA7 15629 AW973788 AA932493 AI347563 AA1 81 309 T67880 AA643033 AW467498 AA1 1 5904 AA93541 0 AA483032 
AA084568 W25246 A1567588 AA155732 AA158614 AA888319 AA158568 AA188422 AI309183 AA084817 AA157995 
AI859659 AA188008 A1287379 AI540675 AA085212 AW028391 AA173297 BE256792 AA182854 BE378771 BE538571 
AA079037 BE281597 AA643926 W81011 AA159344 AA320691 AA877597 T57107 AW263819 AI630413 A1619605 AI687579 
AA970560 A1368942 AI927104 AW419220 AI620Q51 AA128490 AA 120825 AA079520 AA199648 AW188403 BE045224 
AW265533 AA074338 AA102685 AW778399 AA192451 AA182771 AW366812BE281418AA211094AA131073AA487924 
AW674848 AI568103 AA171934 F30349 AW088785 AA581370 AA205482 AW352296 AW517565 AJ376249 AA1 58884 
AJ340509 T59965 AA085193 AA071570 AI874045 AA852755 BE045217 A W1 89428 AA211141 AA652134 AW97729 
AAS94817 AI811459 BE535857 AW769697 AW167892 AW149305 AIB64981 AW272126 AW023245 A1439266 A1953198 
AA160912 AI718580 BE537547 AA501448 AA069308 L07393 AA353007 AA079235 A1539140 AA7401 54 W58341 AA888403 
BE299000 AA196413 BE613327 BE261523 AA866599 AWB44713 AI691 159 A1079975 AW327479 BE180731 AA984805 
AW500732AW504061 
AA774672AW504164 
AA769074 AA570769 AA808585 AA808682 
AW505368 AA218610 Ft 1852 T65345 AA397606 
BE297711 AW505574 AA704983 
F07942T08033 

BE38626S BE148823 723215 AI906290 AA299906 BE2071 97 AW0741 14 AI760368 AI005358 AW662201 AA188988 
AI690711 AA775103 AW072931 A1684269 AW129364 AW615634 AI049941 AW874040 A1352633 AA188989 A1287775 
AA868774AA599660 
AA780365 AA909233 A1275542 
AA210878AA215684 R11101 

M13560 AA336951 AA161015 H72814 T69687 R75705 T61319 AA158454 R50579 T56649 AI214156 T70375 R31655 
H64997 AW800487 H49110 AA634206 H42384 H21783 AI560152 AA664230 H423Q2 R48708 AA01 3277 T61 901 T92417 
AA875985 T61 952 T63055 AA430725 AA458964 AA578746 AI582385 T63000 AI499875 H64998 AA022538 AI364804 
AI865211 AW39714 AI224059 AI249917 T59258 AM77806 AA715834 AA916120 R38304 R35899 R82985 H25524 HB2984 
AW516728 T54642 AA079866 H27555 AA455820763919 R79450 AI431241 AA937349 AA127213 AA421729 H61 196 
T63894 AA013050 AA079133 W96364 AA487926 A1762796 H26377 AM33386 AI865423 AW371475 R98189 AA643978 
AI718204AW381954AI862735 
AA323758R12731 R14082 

R17531 AW960899 AA338366 AW673294 BE047729 BE047722 AA330746 AW841797 H05030 AI142105 R12654 
H07989 AJ239462 H24544 AA078369 R74153 

BE512926 BE304794 AA129140 AA052922 AA092258 BE378058 BE615391 BE615218 BE616188 AI214126 H05675 
W56857 AI028525 BE617241 BE531271 AW856227 T56489 AA322005 AW794148 AF170577 BE615738 AA005138 L76930 
L76932 L76933 X95410 AW389462 BE563092 AW997937 AA263156 A1520992 AW947350 AA522535 AW945921 AV653776 
AW884835 AW947338 AI687178 AW945799 AI905627 AW948449 AV653751 AW945924 AA563898 AW94581 0 AW945832 
AW371 449 AW945864 AW948447 AW945910 AA643002 AA522680 AA522715 AA578840 AA523279 AA826150 AW945809 
AW405998 AA551809 R23173 AA595545 AW389497 AI933770 AI125053 AI471 803 AW795856 AW796937 W30675 H70317 
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308612 
308636 
308814 
308851 
65 308981 



306051 19085J 



321163 171122.1 
321235 1102181J 
320603 4297 J 



320841 185591J 
320651 58648J 
321325 28266 J 

305704 464759.-1 

322011 23158J 

306407 

306454 

306516 

306518 

306526 

306534 

306590 



306631 
306654 
306786 
306799 
308023 
308070 



306805 
306814 
306873 
306911 



308258 
308289 
308311 
308332 
308511 



310570 1071946J 
305022 
305060 
305070 



H68296 T59240 AA397650 H59852 AA938072 AA978010 R35643 T89735 AW361585 AW196153 AI538069 AA604540 
A1434259 R49181 T58717 AW062486 AW796966 A1648384 R77733 AI623502 BE1 71342 BE171303 R35658 AW974883 
AW1 48898 A1500045 A1540710 AI540392 AW0091 72 AW2771 99 A1371312 AI500096 AI470297 AW372940 AW844562 
AWB44560 AW797965 AI691 146 X07062 AW799199 H60666 AA837684 AF 130734 T25952 A1933771 AI914860 AW391 925 
AW783843 AW795012 AW366709 AW750987 AW750985 R35765 AW844942 AW750986 H64920 R34651 X86703 
BE018103 BE018083 BE293253 AW247083 BE207643 BE514793 BE183238 AA376427 AW273850 AW043786 BE439973 
AL045428 AI889O50 AA026496 AI422924 AI884485 W96068 AA020872 F371 19 AA714378 AAQ21 107 AA01 1 141 AI554001 
AI375841 AI469097 AA335219 AW987315 AI692177 AA410448 AI568658 AA582647 AA026419 AA281639 AW515248 
AW007777 AA010840 AW188439 AI805423 AI146210 BE301590 AA744414 AA745392 AW167423 AA622659 AW000878 
AI432387 AA760930 BE047189 AA021605 AV658045 AI093347 AA588594 H 631 43 AA639556 A1308976 AA379270 
AA633407 AI874329 AI206484 AI493895 AI694103 AI249682 AA973765 AA872445 AI125446 AA287Z72 AW069761 
AA682569 AW009712 BE542774 R50167 BE301574 AA991202 AA502006 A1219819 AW074373 AA617996 AJ521242 F25241 
AW615812 R16774 AA335218 AW673800 H25778 A146B557 A1886986 AI560759 AI460075 AA502968 AA503273 AA610680 
AA287274 AA554020 AA284889 AA916536 AW489457 AW273250 AW673708 AW512948 AL041071 AI446042 AA903535 
BE172441 A1282411 AW265021 AA81 0799 AI559865 AA729332 AW00461 1 AW129451 AA659019 BE208239 AA610825 
H03511 BE383995 R16474 AA281701 AW009244 AA287424 AA558139 AW384081 

F08147 AW408359 AW949429 R23785 AW247442 AA305512 T29095 AA905130 BE246361 BE244981 AA220199 BE504058 
X80878 AA533727 AA6086O1 AW005964 AI811627 A1367037 A1277985 AI493719 A1277848 AA854982 AW247298 AK16345 
AI041295 A1887378 AA781241 AI674270 AW628959 AI383083 BE504391 AA729421 AA552188 AA373387 AW880360 
AW875262 AW875369 AW581540 AW875358 AW581568 R23735 AW134768 
W03912 AW971410 AA506385 AA209530 H73495 H48629 W56149 
H56752 AW340384 N49521 

AA853660 AK001668 BE386425 BE563549 BE296124 BE298950 R51419 U46295 BE147292 AA360056 R48018 AW845348 
N47383 AIB17280 AI6719Q2 AA988104 AA479464 N56996 A1192374 AI927558 AA659888 AI799903 AA548397 AI161167 
AI656333 AI418829 AW592671 BE327906 AW513346 AI88B579 AW469410 AW512809 D25682 AA576079 AA479354 
T30342 R51307 T16044 H29063 AW079357 AI339477 R47914 AI986068 AI870065 AI868489 AI521099 A1582732 AA995540 
AW957299 AA352608 AA676752 AA4 105 10 AA358874 A1885724 AA853679 A1699265 AW188789 N47380 AA233715 
BE258194 R55421 R55643 H42362 AA243884 

AW886407 AA489268 R57015 R58094 BE077459 BE077423 BE546995 AW849216T69383 AW9381 11 H60337 BE221073 
AB033100 AA347038 BE260325 AW961669 AL047207 AA347037 AI766894 AA601045 AI559897 AW139033 AW274622 
AW172684 AW089070 AA804340 AW798925 
AAB25266 

AL1 37354 AL043375 
AA971985 ' 
AA977992 
AA9B9542 



AA989713 
AA991487 
AI000246 
AI000248 
AI001149 
AI003654 
AI041589 
AI051696 
A1452732 
AI470948 
AI475914 
AI055966 
AI066577 
AI086929 
AI095365 
AI127883 



AI565612 
AI571211 
AI581855 
A1591235 
AI687580 
AI719930 
AI735634 
AI744063 



AI829820 
A1873242 

A1318327 A1318328 A1318495 

AA627416 

AA635771 

AA639783 
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305079 
305134 
303977 
305216 
305263 
305266 
305396 
305403 
305488 
305549 
305601 
305610 
305621 
305710 
305724 
305744 
305752 
307018 
307055 
307058 
305801 
305830 
305836 
305852 
305858 
305866 
305867 
307126 
305903 



AA641329 

AA653159 

AW512978 

AA669056 

AA 679467 

AA679772 

AA721052 

AA723748 

AA749000 

AA773530 

AA780975 

AA782319 

AA789095 

AA826544 

AA827608 

AA831819 

AA835278 

AI140639 

AI148477 

AI148709 

AA845997 

AA857665 

AA858043 

AA862455 

AA863103 

AA864533 

AA864572 

AI184951 

AA873085 



328803 c_7_hs 
328809 CjJlS 
305949 AA884409 
328829 C_7_hs 
330021 c16jp2 
330024 c16_p2 
330028 c16_p2 
330049 d7_p2 
305993 AA889197 
330095 c19_p2 
330098 c19j>2 

307205 A1192479 
307427 A1243437 
307491 AI268539 
307581 A1284415 
307588 AI28553S 
337672 CH22.6OO2FGL-UNK.EMAC0O 
337693 CH2?_6O3OFGL.UNK_EMAC00 
337738 CH22_6083FGLJJNK-EMAC00 
307692 A1316342 
307806 AI351739 
309107 A1925823 
309230 AI970747 
339338 CH22_8300FG_LINK_BA354I1 
309257 AI9B4183 
309366 AW072970 
309422 AW087175 
325207 elOJis 
325257 c11 hs 

309646 AW194694 
309651 AW195850 
325313 dljis 

309924 AW340812 
334030 CH22_1308FG^320XLINK^EM 
334040 CH22 J31 8FG_322_8_L!NK^EM 
334083 CH22J361FG_327_38.UNieE 
332810 CH22_26FG_7J2_UNK_C65E1 
302747 32813J AF062275 L0383O 

302753 33029 1 M74299 M743Q2 M74303 

302777 33803.1 AJ230640 AJ230648 
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304094 
302824 
302996 
325870 
304240 
304410 
304443 
304475 
304522 
304678 
304705 
306004 
306008 
306013 
306082 
336174 
9)6094 
304823 
304872 
304918 
304955 



35372,1 
41196J 
C18JB 



H11295 

U21260U21258 
AF054663AF124197 R70292 

AA009802 
AA284508 
AA399444 
AA42B879 
AA465405 
AA548556 
AA564064 
AA889992 
AA894390 



AA908508 
OI22_3567FG_710_1JJNKJ)A 

AA908877 
AA584837 



306286 
306295 
306317 
306347 
306365 
306398 
330401 
330463 



AA602697 
AA613504 
AA93384Q 
AA936892 
AA937331 
AA947909 
AA861144 



AA970548 
entrez_D2B383 D28383 



460L2 



1374.-8 
10404.2 



NIUL001 055 AA332948 U26309 U09031 L19955 L10819 A1366043 X84654 U71086 AV654451 AJ007418 AA053625 
BE168856 AA376730 H12694 AA810348 AA621972 AI818950 AV645367 AI819966 AA910602 AW512449 H67893 AI310497 
AI304330 A1339217 AW1935B8 AW438688 AI818970 AW316799 AA906527 AA777570 N47673 AI336428 AW945133 
AI038506 R29692 AW194197 A1304748 H12639 AA0531 78 AA49321 3 AA67695B AA1 13154 A1313469 AB68239 R93183 
W24532 U52852 U54701 AL046864 AA365795 
U11872 

U24488NM_007116 
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TABLE 13B shows the genomic positioning for those primekeys lacking unigene ID's and 
accession numbers in Table 13. For each predicted exon, we have listed the genomic 
sequence source used for prediction. Nucleotide locations of each predicted exon are also 
listed. 



Pkey: Unique number corresponding to an Eos probeset.

Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers.

Strand: Indicates DNA strand from which exons were predicted.

Nt_position: Indicates nucleotide positions of predicted exons.
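
A row of this table, once its four columns sit on one line, can be read mechanically. The following Python sketch is illustrative only: the names `ExonRecord`, `parse_row`, and `exon_length` are hypothetical, and treating both end coordinates as inclusive is an assumption that the table itself does not state.

```python
import re
from typing import NamedTuple


class ExonRecord(NamedTuple):
    """One row of the table: Pkey, Ref, Strand, Nt_position (hypothetical type)."""
    pkey: int
    ref: str
    strand: str
    start: int
    end: int


# Pkey, then free-text Ref, then Plus/Minus, then "start-end".
ROW = re.compile(r"^(\d+)\s+(.+?)\s+(Plus|Minus)\s+(\d+)-(\d+)$")


def parse_row(line: str) -> ExonRecord:
    """Parse a single cleaned row; raise ValueError if it does not fit the layout."""
    match = ROW.match(line.strip())
    if match is None:
        raise ValueError(f"row does not match the expected layout: {line!r}")
    pkey, ref, strand, start, end = match.groups()
    return ExonRecord(int(pkey), ref, strand, int(start), int(end))


def exon_length(rec: ExonRecord) -> int:
    """Length in nucleotides, assuming both coordinates are inclusive.

    Minus-strand rows list the larger coordinate first, so the absolute
    difference is used.
    """
    return abs(rec.end - rec.start) + 1


if __name__ == "__main__":
    rec = parse_row("333454 Dunham, I. et al. Minus 5137007-5136880")
    print(rec, exon_length(rec))
```

Rows whose coordinates are damaged (for example, a missing hyphen or a stray symbol in place of a digit) are rejected with a `ValueError` rather than guessed at.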



Pkey Ref 

332791 Dunham, I. et al.
332792 Dunham, I. et al.
332610 Dunham, I. et al.
332944 Dunham, I. et al.
332972 Dunham, I. et al.
333133 Dunham, I. et al.
333154 Dunham, I. et al.
333155 Dunham, I. et al.
333227 Dunham, I. et al.
333230 Dunham, I. et al.
333298 Dunham, I. et al.
333304 Dunham, I. et al.
333305 Dunham, I. et al.
333365 Dunham, I. et al.
333363 Dunham, I. et al.
333391 Dunham, I. et al.
333392 Dunham, I. et al.
333397 Dunham, I. et al.
333403 Dunham, I. et al.
333413 Dunham, I. et al.
333445 Dunham, I. et al.
333479 Dunham, I. et al.
333481 Dunham, I. et al.
333483 Dunham, I. et al.
333516 Dunham, I. et al.
333517 Dunham, I. et al.
333518 Dunham, I. et al.
333531 Dunham, I. et al.
333566 Dunham, I. et al.
333572 Dunham, I. et al.
333586 Dunham, I. et al.
333588 Dunham, I. et al.
333594 Dunham, I. et al.
333595 Dunham, I. et al.
333600 Dunham, I. et al.
333601 Dunham, I. et al.
333607 Dunham, I. et al.
333612 Dunham, I. et al.
333613 Dunham, I. et al.
333614 Dunham, I. et al.
333624 Dunham, I. et al.
333626 Dunham, I. et al.
333635 Dunham, I. et al.
333637 Dunham, I. et al.
333642 Dunham, I. et al.
333647 Dunham, I. et al.
333653 Dunham, I. et al.
333654 Dunham, I. et al.
333656 Dunham, I. et al.
333657 Dunham, I. et al.
333658 Dunham, I. et al.



Strand Nt_position 



PUS 


7272U-/0315 


PUS 


73381-73768 


rUS 




rflJS 


cA\ 4820-241 4$>2 


PUIS 


OCTOHCO QC7040C 

25721 52-2572236 


PUS 


3350Q5B-336Q185 


Plus 


OC 4 COOT OC4CTHO 

361 5887-36 1601 9 


PiliS 


QC4COOO 004TTV>0 


Plus 


399Z866-399296o 


Plus 3995507-3996507
Plus 4581537-4581847
Plus 4629943-4630242
Plus 4630388-4630645
Plus 4786883-4787283
Plus 4907179-4907277
Plus 4916697-4916780
Plus 4918294-4918433
Plus 4922466-4922635
Plus 4925140-4925256
Plus 4943824-4943974
Plus 5097827-5097885
Plus 5272855-5272939
Plus 5286358-5286505
Plus 5297945-5208105
Plus 5570204-5570390
Plus 5570729-5570925
Plus 5571761-5572025
Plus 5622622-5622684
Plus 59542266954473
Plus 6026896-6027189
Plus 6246834-6247314
Plus 625544^6255779
Plus 630899&6309450
Plus 6323103-6323348
Plus 6355629-6355925
Plus 6360075-6360442
Plus 6504431-6504690
Plus 6549563-6549697
Plus 6550643-6550748
Plus 6551227-6551389
Plus 6595146-6595244
Plus 6614174-6614467
Plus 6663683-6663973
Plus 6674968-6675134
Plus 6708760-6709139
Plus 6772502-6772779
Plus 6811130-6811392
Plus 6816731-6816993
Plus 6822087-6822406
Plus 6831369-6831445
Plus 6835282-6835474
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Plus 
Plus 
Plus 
Plus 
Pius 
Plus 
Pius 
Plus 
Pius 
Plus 
Pius 



333659 Dunham, I. et al. Plus
333684 Dunham, I. et al. Plus
333686 Dunham, I. et al. Plus
333697 Dunham, I. et al.
333698 Dunham, I. et al.
333699 Dunham, I. et al.
333703 Dunham, I. et al.
333709 Dunham, I. et al.
333747 Dunham, I. et al.
333774 Dunham, I. et al.
333775 Dunham, I. et al.
333806 Dunham, I. et al.
333843 Dunham, I. et al.
333854 Dunham, I. et al.
333873 Dunham, I. et al. Plus
333880 Dunham, I. et al. Plus
333885 Dunham, I. et al. Plus
333918 Dunham, I. et al. Plus
333947 Dunham, I. et al. Plus
333961 Dunham, I. et al. Plus
333981 Dunham, I. et al. Plus
333991 Dunham, I. et al. Plus
333994 Dunham, I. et al. Plus
334030 Dunham, I. et al. Plus
334083 Dunham, I. et al. Plus
334111 Dunham, I. et al. Plus
334135 Dunham, I. et al. Plus
334218 Dunham, I. et al. Plus
334249 Dunham, I. et al. Plus
334262 Dunham, I. et al. Plus
334264 Dunham, I. et al. Plus
334327 Dunham, I. et al. Plus
334328 Dunham, I. et al. Plus
334340 Dunham, I. et al. Plus
334454 Dunham, I. et al. Plus
334504 Dunham, I. et al.
334508 Dunham, I. et al.
334512 Dunham, I. et al.
334582 Dunham, I. et al.
334659 Dunham, I. et al.
334721 Dunham, I. et al.
334723 Dunham, I. et al. Plus
334730 Dunham, I. et al. Plus
334774 Dunham, I. et al. Plus
334778 Dunham, I. et al. Plus
334851 Dunham, I. et al. Plus
334885 Dunham, I. et al.
334902 Dunham, I. et al.
334905 Dunham, I. et al.
334906 Dunham, I. et al.
334910 Dunham, I. et al.
335018 Dunham, I. et al.
335025 Dunham, I. et al. Plus
335033 Dunham, I. et al. Plus
335044 Dunham, I. et al. Plus
335142 Dunham, I. et al.
335157 Dunham, I. et al.
335160 Dunham, I. et al.
335174 Dunham, I. et al.
335188 Dunham, I. et al.
335190 Dunham, I. et al.
335191 Dunham, I. et al.
335193 Dunham, I. et al. Plus
335204 Dunham, I. et al. Plus
335222 Dunham, I. et al. Plus
335226 Dunham, I. et al.
335227 Dunham, I. et al.
335309 Dunham, I. et al.
335310 Dunham, I. et al.



Pius 
Pius 
Pius 
Pius 
Pius 
Phis 



Plus 
Plus 
Plus 
Plus 
Pius 
Plus 



Pius 
Phis 
Plus 
Plus 
Plus 
Plus 
Plus 



Pius 
Pius 
Plus 
Plus 



7169561-7169742 

7177117-7177302 

7203859-7203934 

7205279-72)5383 

7206101-7206175 

7215559-7215663 

7229730-7229835 

7605884-7606206 

7716509-7716636 

7729983-7730149 

7877475-7877666 

7978762-7978887 

80294464029524 

81332664133429 

81518234152133 

8154352-8154437 

83071244307215 

85798884579966 

66179994618104' 

87823744782643 

88374194837551 

6852749-8352894 



98370164837081 

10279365-10279531 

10457085-10457183 

12680289-12680378 

13190430-13190574 

13231452-13231561 

13234447-13234544 

13577413-13577496 



13642407-13642522 
14326506-14326738 
14510206-14510398 
14514936-14515122 
14545933-14546366 
15026255-15026371 
15460624-15460726 
15796816-15796987 
15805317-15805399 
15967830-15967834 
16251857-16252178 
16276180-16276395 
17820110-17820810 
19233667-19233787 
19317083-19317195 
19322553-19322680 
19323493-19323590 



20688288-20688415 
20743941-20744050 
20753188-20753314 
20842088-20842682 
21465105-21465186 
21543302-21544341 
21573388-21573497 
21631301-21631447 
21669118-21669328 
21680807-21680876 
21681110-21681183 



21750636-21750726 
21885542-21885608 
21890838-21890930 
21892145-21892289 
22500158-22500276 
22500714-22500831 
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335311 Dunham, ULal Pius 
335355 Dunham, I. et al. Plus
335362 Dunham, I. et al. Plus
335368 Dunham, I. et al. Plus
335384 Dunham, I. et al. Plus
335385 Dunham, I. et al. Plus
335436 Dunham, I. et al. Plus
335440 Dunham, I. et al. Plus
335441 Dunham, I. et al. Plus
335450 Dunham, I. et al. Plus
335453 Dunham, I. et al. Plus
335458 Dunham, I. et al. Plus
335464 Dunham, I. et al. Plus
335496 Dunham, I. et al. Plus
335497 Dunham, I. et al. Plus
335498 Dunham, I. et al. Plus
335499 Dunham, I. et al. Plus
335500 Dunham, I. et al. Plus
335507 Dunham, I. et al. Plus
335510 Dunham, I. et al. Plus
335513 Dunham, I. et al. Plus
335627 Dunham, I. et al. Plus
335651 Dunham, I. et al. Plus
335655 Dunham, I. et al. Plus
335656 Dunham, I. et al. Plus
335658 Dunham, I. et al. Plus
335663 Dunham, I. et al. Plus
335685 Dunham, I. et al. Plus
335667 Dunham, I. et al. Plus
335668 Dunham, I. et al. Plus
335689 Dunham, I. et al. Plus
335690 Dunham, I. et al. Plus
335715 Dunham, I. et al. Plus
335719 Dunham, I. et al. Plus
335734 Dunham, I. et al. Plus
335744 Dunham, I. et al. Plus
335809 Dunham, I. et al. Plus
335819 Dunham, I. et al. Plus
335822 Dunham, I. et al. Plus
335872 Dunham, I. et al. Plus
335885 Dunham, I. et al. Plus
335968 Dunham, I. et al. Plus
335971 Dunham, I. et al. Plus
335975 Dunham, I. et al. Plus
335976 Dunham, I. et al. Plus
335969 Dunham, I. et al. Plus
335990 Dunham, I. et al. Plus
336010 Dunham, I. et al. Plus
336093 Dunham, I. et al. Plus
336126 Dunham, I. et al. Plus
336129 Dunham, I. et al. Plus
336187 Dunham, I. et al. Plus
336188 Dunham, I. et al. Plus
336225 Dunham, I. et al. Plus
336371 Dunham, I. et al. Plus
336373 Dunham, I. et al. Plus
336377 Dunham, I. et al. Plus
336380 Dunham, I. et al. Plus
336383 Dunham, I. et al. Plus
336384 Dunham, I. et al. Plus
336385 Dunham, I. et al. Plus
336386 Dunham, I. et al. Plus
336441 Dunham, I. et al. Plus
336444 Dunham, I. et al. Plus
336484 Dunham, I. et al. Plus
336497 Dunham, I. et al. Plus
336499 Dunham, I. et al. Plus
336503 Dunham, I. et al. Plus
336548 Dunham, I. et al. Plus



22501602-22501676 
22779222-22779516 
22809167-22809461 
22843040-22843184 
22918150-22918263 
22919072-22919339 
23427793-23427923 
23458702-23459017 
23460632-23460724 
23480190-23480270 
23483333-23483459 
23490034-23490143 
23500331-23500496 
24164386-24164545 
24167666-24167869 
24172082-24172161 
24176698-24176869 
24178236-24178326 
24219973-24220039 
24222975-24223118 
24224272-24224496 
25150005-25150061 
25317560-25317696 
25333211-25333369 
25333601-25333751 
25336315-25336406 
25342680-25342802 
25344098-25344287 



25346313-25346447 
25454350-25454604 
25455442-25455625 
25565941-25566052 
25593936-25594101 



25716483-25716615 
26310772-26310909 
26356341-26356470 
26364087-26364196 



26933436-26933534 
27743843-27744029 
27752808-27753017 
27801321-27801391 
27809041-27809187 
27983788-27983860 



30057891-30058105 
30062259-30062348 
30433494-30433585 
30434870-30435004 
30833614-30833788 



33976308-33976504 



34005784-34005964 
34007429-34007559 
34007879-34008159 
34012365-34013115 
34187606-34187663 
34190585-34190718 
34237425-34237505 
34267190-34267245 
34267504-34267572 
34271306-34271372 
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55 



336715
336603
336805
336850
336911
336993
337076
337109
337123
337151
337189
337241
337396
337414
337418
337461
337480
337482
337483
337490
337532
337552
337584
337611
337672
337738
337944
337954
337996
338016
338174
338176
338238
338277
338316
338323
338410
338414
338460
338481
338489
338514



Dunham, L etaL Phis 
Dunham, I. etaL Plus 
Dunham, I. etaL Plus 
Dunham, I. etaL Plus 
Dunham, L etai. Plus 
Dunham, I. etaL Plus 
Dunham, I. etaL Plus 
Dunham,!. eta). Plus 
Dunham, I. etaL Phis 
Dunham, I. etaL Plus 
Dunham, I. etaL Plus 
Dunham, I. etaL Plus 
Dunham, I. etaL Plus 
Dunham, Letal. Pius 
Dunham, 1. etaL Plus 
Dunham, 1. etaL Plus 
Dunham, I. etaL Plus 
Dunham, L etai. Pius 
Dunham, Letal. Plus 
Dunham,!. etaL Plus 
Dunham, 1. etaL Phis 
Dunham,!. etaL Plus 
Dunham, 1. etaL Plus 
Dunham, I. eLaL Phis 
Dunham, I. etaL Plus 
Dunham, L etaL Phis 
Dunham, I. etaL Plus 
Dunham,!. etaL Plus 
Dunham, LetaL Phis 
Dunham,!. etaL Plus 
Dunham, I. etaL Phis 
Dunham, I. etaL Plus 
Dunham, I. etaL Phis 
Dunham, LetaL Pius 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, I. etaL Plus 
Dunham, Letal. Plus 
Dunham, Letal. Plus 
Dunham, LetaL Plus 
Dunham, Letal. Plus 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, Letal. Plus 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL Phis 
Dunham, LetaL Plus 
Dunham, LetaL Phis 
Dunham, Letal. Plus 
Dunham, Letal. Plus 
Dunham, I. etai. Phis 
Dunham, 1. etaL Plus 
Dunham, LetaL Plus 
Dunham, I. eLaL Pius 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, I. etaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, Letal. Pius 
Dunham, LetaL Plus 



34428521-34428637 

1896402-1896478 

3110198-3110314 

61069044106990 

6126661-6126786 

7745284-7745355 

8130457-8130612 

11035818-11035984 

12818687-12818891 

12875843-12875912 

13203550-13203973 

15096270-15096324 

19338177-19338679 

21166580-21166650 

22052874-22052942 

23106433-23106510 

24225887-24225954 

27280182-27280313 

30395182-30395285 

30804624-30804780 

3133339941333580 

31585902-31586067 

31953012-31953205 

3201404942014131 



3321971443219779 
33227865-33227946 
33237292-33237427 
3331857143318644 



3418726944187366 

19497-18600 

945236445452 

1482883-1483016 

33312364331313 

35759754576153 



6286377-6286470 
63430334343172 



7445532-7445633 

7601363-7601520 

7863131-7863310 

12771102-12771268 

12774072-12774223 

14661936-14662015 

16167622-16167962 

16463958-16464539 

17089711-17089988 

17154655-17154792 

17155309-17155574 

18611213-18611407 

18953492-18953581 

19292807-19292916 

19345573-19345660 

20233372-20233488 



21142605-21143049 
21253847-21253974 
21379420-21379655 
21636361-21636509 
23540239-23540334 
23711167-23711241 
24219427-24219509 
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338660 Dunham, I. etal 
338704 Dunham, I. et al.
338847 Dunham, I. et al.
338887 Dunham, I. et al.
338895 Dunham, I. et al.
338915 Dunham, I. et al.
338825 Dunham, I. et al.
338936 Dunham, I. et al.
338952 Dunham, I. et al.
338980 Dunham, I. et al.
338981 Dunham, I. et al.
338986 Dunham, I. et al.
339009 Dunham, I. et al.
339017 Dunham, I. et al.
339045 Dunham, I. et al.
339046 Dunham, I. et al.
339059 Dunham, I. et al.
339067 Dunham, I. et al.
339069 Dunham, I. et al.
339078 Dunham, I. et al.
339084 Dunham, I. et al.
339101 Dunham, I. et al.
339102 Dunham, I. et al.
339103 Dunham, I. et al.
339115 Dunham, I. et al.
339157 Dunham, I. et al.
339166 Dunham, I. et al.
339167 Dunham, I. et al.
339288 Dunham, I. et al.
339289 Dunham, I. et al.
339291 Dunham, I. et al.
339407 Dunham, I. et al.
332865 Dunham, I. et al.
332881 Dunham, I. et al.
332930 Dunham, I. et al.
332931 Dunham, I. et al.
332984 Dunham, I. et al.
332986 Dunham, I. et al.
332997 Dunham, I. et al.
333051 Dunham, I. et al.
333061 Dunham, I. et al.
333064 Dunham, I. et al.
333096 Dunham, I. et al.
333099 Dunham, I. et al.
333106 Dunham, I. et al.
333160 Dunham, I. et al.
333163 Dunham, I. et al.
333165 Dunham, I. et al.
333166 Dunham, I. et al.
333170 Dunham, I. et al.
333174 Dunham, I. et al.
333188 Dunham, I. et al.
333214 Dunham, I. et al.
333232 Dunham, I. et al.
333237 Dunham, I. et al.
333239 Dunham, I. et al.
333255 Dunham, I. et al.
333259 Dunham, I. et al.
333274 Dunham, I. et al.
333290 Dunham, I. et al.
333295 Dunham, I. et al.
333296 Dunham, I. et al.
333310 Dunham, I. et al.
333311 Dunham, I. et al.
333312 Dunham, I. et al.
333313 Dunham, I. et al.
333315 Dunham, I. et al.
333318 Dunham, I. et al.
333321 Dunham, I. et al.



Plus 24387122-24387266
Plus 25230432-25230548
Plus 27995337-27995420
Plus 28465244-28465384
Plus 28598893-28599135
Plus 28824881-28824977
Plus 28883892-28884036
Plus 29148022-29148160
Plus 29418831-29418968
Plus 29896789-29896874
Plus 29897917-29898008
Plus 30007287-30007415
Plus 30348477-30348598
Plus 30420896-30421090
Plus 30744285-30744356
Plus 30746269-30746420
Plus 30814655-30814801
Plus 30869347-30869412
Plus 30880975-30881070
Plus 30914310-30914423
Plus 30944556-30944803
Plus 31158047-31158123
Plus 31169321-31169563
Plus 31170343-31170454
Plus 31459869-31459927
Plus 32131701-32131833
Plus 32210902-32211006
Plus 32213567-32213730
Plus 33169611-33169691
Plus 33166756-33186903
Plus 33205057-33205247
Plus 34189461-34189620
Minus 1391482-1391218
Minus 1563520-1563184
Minus 2022565-2022497
Minus 2023851-2023562
Minus 2632606-2632457
Minus 2635398-2635206
Minus 2710509-2710375
Minus 2991973-2991840
Minus 3029631-3029527
Minus 3030722-3030623
Minus 3184234-3184118
Minus 3206796-3206674
Minus 3230744-3230547
Minus 3654893-3654678
Minus 3665124-3564962
Minus 3674052-3673905
Minus 3694664-3694567
Minus 3733394-3733299
Minus 3764284-3764210
Minus 3826990-3826863
Minus 3966559-3966437
Minus 4001551-4001365
Minus 4003326-4003219
Minus 4095861-4094462
Minus 4297883-4297716
Minus 4306769-4306639
Minus 4389146-4388954
Minus 4530734-4530554
Minus 4549290-4549198
Minus 4550766-4550644
Minus 4637315-4637232
Minus 4637933-4637844
Minus 4638794-4638635
Minus 4639397-4639277
Minus 5405980-5405876
Minus 4642636-4642564
Minus 4649080-4648934



WO 02/30268 PCT/US01/32045 



333327 Dunham, I. et al. Minus 4657947-4657828
333335 Dunham, I. et al. Minus 4672656-4672564
333337 Dunham, I. et al. Minus 4677930-4677841
333454 Dunham, I. et al. Minus 5137007-5136880
333458 Dunham, I. et al. Minus 5143942-5143806
333459 Dunham, I. et al. Minus 514454*5144344
333470 Dunham, I. et al. Minus 5223319-5223088
333493 Dunham, I. et al. Minus 4637315-4637232
333496 Dunham, I. et al. Minus 5404643-5404523
333498 Dunham, I. et al. Minus 5405980-5405876
333510 Dunham, I. et al. Minus 555762*5557469
333546 Dunham, I. et al. Minus 5886643-5886442
333561 Dunham, I. et al. Minus 5903659-5903590
333738 Dunham, I. et al. Minus 7552160-7552084
333780 Dunham, I. et al. Minus 7750367-7750277
333783 Dunham, I. et al. Minus 7751850-7751777
333818 Dunham, I. et al. Minus 7911959-7911762
333894 Dunham, I. et al. Minus B168855-6188709
333897 Dunham, I. et al. Minus 81943904194284
333900 Dunham, I. et al. Minus 82002684200122
333909 Dunham, I. et al. Minus 822963^6229477
333938 Dunham, I. et al. Minus 85128054512564
333944 Dunham, I. et al. Minus 8557051-8556936
334040 Dunham, I. et al. Minus 934289*9342934
334154 Dunham, I. et al. Minus 10570714-10570572
334178 Dunham, I. et al. Minus 11755052-11754971
334188 Dunham, I. et al. Minus 11925963-11925834
334273 Dunham, I. et al. Minus 13265608-13265522
334282 Dunham, I. et al. Minus 13285293-13285178
334285 Dunham, I. et al. Minus 13289990-13289793
334286 Dunham, I. et al. Minus 13291759-13291569
334303 Dunham, I. et al. Minus 13454331-13454217
334305 Dunham, I. et al. Minus 13456310-13456209
334306 Dunham, I. et al. Minus 13461157-13461049
334320 Dunham, I. et al. Minus 13496857-13496717
334352 Dunham, I. et al. Minus 13675908-13675828
334353 Dunham, I. et al. Minus 13683722-13683596
334359 Dunham, I. et al. Minus 13728664-13728534
334363 Dunham, I. et al. Minus 13740004-13739812
334365 Dunham, I. et al. Minus 13742078-13741971
334399 Dunham, I. et al. Minus 14186289-14186163
334409 Dunham, I. et al. Minus 14195181-14195075
334414 Dunham, I. et al. Minus 14234033-142339%
334470 Dunham, I. et al. Minus 14389581-14389442
334483 Dunham, I. et al. Minus 14428355-14428281
334489 Dunham, I. et al. Minus 14455428-14454288
334498 Dunham, I. et al. Minus 14483789-14483700
334501 Dunham, I. et al. Minus 14487509-14487356
334502 Dunham, I. et al. Minus 14488605-14488526
334543 Dunham, I. et al. Minus 14834496-14834116
334622 Dunham, I. et al. Minus 15191678-15191609
334650 Dunham, I. et al. Minus 15371251-15371178
334680 Dunham, I. et al. Minus 15520047-15519887
334745 Dunham, I. et al. Minus 16049960-16049653
334756 Dunham, I. et al. Minus 16128678-16128528
334758 Dunham, I. et al. Minus 16132368-16132233
334761 Dunham, I. et al. Minus 16138424-16138319
334763 Dunham, I. et al. Minus 16148136-16148077
334784 Dunham, I. et al. Minus 16294548-16294360
334790 Dunham, I. et al. Minus 16307576-16307509
334793 Dunham, I. et al. Minus 16330748-16330681
334802 Dunham, I. et al. Minus 16413158-16413026
334820 Dunham, I. et al. Minus 16764338-16764249
334824 Dunham, I. et al. Minus 16857777-16857674
334832 Dunham, I. et al. Minus 17173957-17173760
334842 Dunham, I. et al. Minus 17464352-17464181
334844 Dunham, I. et al. Minus 17503891-17503768
334857 Dunham, I. et al. Minus 18488368-18488242
334927 Dunham, I. et al. Minus 19988711-19987853
334939 Dunham, I. et al. Minus 20131162-20131054
334851 Dunham, I. et al. Minus 20147708-20147602
334969 Dunham, I. et al. Minus 20188176-20188020
334972 Dunham, I. et al. Minus 20294734-20294611
335050 Dunham, I. et al. Minus 20884109-20883951
335078 Dunham, I. et al. Minus 21059529-21059458
335102 Dunham, I. et al. Minus 21313841-21313598
335105 Dunham, I. et al. Minus 21320563-21320440
335110 Dunham, I. et al. Minus 21334136-21333811
335111 Dunham, I. et al. Minus 21335946-21335809
335115 Dunham, I. et al. Minus 21388250-21388146
335116 Dunham, I. et al. Minus 21388573-21388414
335185 Dunham, I. et al. Minus 21651593-21651522
335186 Dunham, I. et al. Minus 21656436-21656338
335230 Dunham, I. et al. Minus 21899517-21898678
335236 Dunham, I. et al. Minus 21915016-21914870
335243 Dunham, I. et al. Minus 21933519-21933365
335249 Dunham, I. et al. Minus 21950851-21950669
335258 Dunham, I. et al. Minus 22043431-22043262
335261 Dunham, I. et al. Minus 22063937-22063772
335276 Dunham, I. et al. Minus 22154036-22153937
335279 Dunham, I. et al. Minus 22168834-22166638
335330 Dunham, I. et al. Minus 22556589-22556422
335331 Dunham, I. et al. Minus 22556823-22556708
335334 Dunham, I. et al. Minus 22560390-22560136
335346 Dunham, I. et al. Minus 22641097-22640918
335349 Dunham, I. et al. Minus 22661861-22661271
335611 Dunham, I. et al. Minus 25070825-25070706
335612 Dunham, I. et al. Minus 25072328-25072142
335671 Dunham, I. et al. Minus 25358629-25358533
335676 Dunham, I. et al. Minus 25395274-25395152
335680 Dunham, I. et al. Minus 25402437-25402361
335750 Dunham, I. et al. Minus 25732501-25731972
335752 Dunham, I. et al. Minus 25757026-25756890
335755 Dunham, I. et al. Minus 25763806-25763747
335767 Dunham, I. et al. Minus 25818547-25819218
335774 Dunham, I. et al. Minus 25883733-25883572
335777 Dunham, I. et al. Minus 25885770-25885599
335778 Dunham, I. et al. Minus 25886469-25886334
335797 Dunham, I. et al. Minus 25958182-25958030
335600 Dunham, I. et al. Minus 25985373-25985280
335818 Dunham, I. et al. Minus 26323886-26323744
335834 Dunham, I. et al. Minus 26391707-26391530
335840 Dunham, I. et al. Minus 26420596-26420538
335844 Dunham, I. et al. Minus 26433427-26433344
335846 Dunham, I. et al. Minus 26436727-26436621
335856 Dunham, I. et al. Minus 26662452-26662346
335887 Dunham, I. et al. Minus 26939225-26938782
335888 Dunham, I. et al. Minus 26943037-26942820
335889 Dunham, I. et al. Minus 26946988-26946901
335890 Dunham, I. et al. Minus 26949087-26948665
335893 Dunham, I. et al. Minus 26973898-26973747
335895 Dunham, I. et al. Minus 26975307-26975239
335896 Dunham, I. et al. Minus 26977639-26977558
335900 Dunham, I. et al. Minus 26980354-26980238
335907 Dunham, I. et al. Minus 27013352-27013273
335943 Dunham, I. et al. Minus 27446610-27446378
335956 Dunham, I. et al. Minus 27653729-27653635
335959 Dunham, I. et al. Minus 27682313-27682145
335962 Dunham, I. et al. Minus 27704276-27704144
336040 Dunham, I. et al. Minus 29036458-29036300
336044 Dunham, I. et al. Minus 29043828-29043727
336047 Dunham, I. et al. Minus 29050617-29050466
336068 Dunham, I. et al. Minus 29252077-29251969
336143 Dunham, I. et al. Minus 30135948-30135854
336158 Dunham, I. et al. Minus 30163730-30163610
336174 Dunham, I. et al. Minus 30241988-30241839
336223 Dunham, I. et al. Minus 30816306-30816195
336245 Dunham, I. et al. Minus 31420569-31420509
336274 Dunham, I. et al. Minus 32085468-32085303
336318 Dunham, I. et al. Minus 33364452-33364338
336326 Dunham, I. et al. Minus 33567328-33567201
336339 Dunham, I. et al. Minus 33798479-33788330
336340 Dunham, I. et al. Minus 33812069-33811915
336355 Dunham, I. et al. Minus 33874750-33874649
336392 Dunham, I. et al. Minus 34015668-34015738
336393 Dunham, I. et al. Minus 34016145-34015951
336394 Dunham, I. et al. Minus 34016457-34016298
336400 Dunham, I. et al. Minus 34023437-34023298
336402 Dunham, I. et al. Minus 34024090-34023981
336413 Dunham, I. et al. Minus 34046702-34046576
336424 Dunham, I. et al. Minus 34055549-34055491
336425 Dunham, I. et al. Minus 34058544-34058446
336437 Dunham, I. et al. Minus 34074154-34074090
336447 Dunham, I. et al. Minus 34198207-34197996
336449 Dunham, I. et al. Minus 34204707-34204577
336466 Dunham, I. et al. Minus 34213195-34213046
336492 Dunham, I. et al. Minus 34255578-34255437
336511 Dunham, I. et al. Minus 34277480-34277351
336512 Dunham, I. et al. Minus 34278373-34278275
336520 Dunham, I. et al. Minus 34319184-34319101
336522 Dunham, I. et al. Minus 34320169-34320056
336524 Dunham, I. et al. Minus 34321055-34320921
336527 Dunham, I. et al. Minus 34322071-34321966
336534 Dunham, I. et al. Minus 34326797-34326620
336536 Dunham, I. et al. Minus 34327678-34327538
336542 Dunham, I. et al. Minus 34331316-34331183
336556 Dunham, I. et al. Minus 34375244-34374907
336557 Dunham, I. et al. Minus 34375443-34375341
336558 Dunham, I. et al. Minus 34375625-34375698
336559 Dunham, I. et al. Minus 34376430-34376261
336560 Dunham, I. et al. Minus 34376814-34376596
336561 Dunham, I. et al. Minus 34377168-34376928
336597 Dunham, I. et al. Minus 7627912-7627757
336601 Dunham, I. et al. Minus 13265853-13265654
336642 Dunham, I. et al. Minus 1304281-1304212
336645 Dunham, I. et al. Minus 1351268-1351168
336662 Dunham, I. et al. Minus 2158060-2157993
336664 Dunham, I. et al. Minus 1993558-1993481
336678 Dunham, I. et al. Minus 2022565-2022497
336684 Dunham, I. et al. Minus 2158060-2157993
336686 Dunham, I. et al. Minus 2160698-2160486
336714 Dunham, I. et al. Minus 3094026-3093871
336719 Dunham, I. et al. Minus 3331631-3331503
336736 Dunham, I. et al. Minus 4093128-4093041
336744 Dunham, I. et al. Minus 4333001-4332848
336786 Dunham, I. et al. Minus 5419973-5418873
336793 Dunham, I. et al. Minus 5631345-5631237
336859 Dunham, I. et al. Minus 8201756-8201561
336863 Dunham, I. et al. Minus 8396673-8396425
336933 Dunham, I. et al. Minus 11760045-11759981
336942 Dunham, I. et al. Minus 12027537-12027455
336960 Dunham, I. et al. Minus 13267243-13267172
336969 Dunham, I. et al. Minus 13725722-13725643
336971 Dunham, I. et al. Minus 13732308-13732221
337003 Dunham, I. et al. Minus 15523541-15523422
337011 Dunham, I. et al. Minus 16106423-16106080
337070 Dunham, I. et al. Minus 19034423-19034321
337072 Dunham, I. et al. Minus 19077452-19077323
337086 Dunham, I. et al. Minus 19657011-19656881
337140 Dunham, I. et al. Minus 22849450-22649388
337193 Dunham, I. et al. Minus 24594969-24594874
337256 Dunham, I. et al. Minus 27659956-27659876
337278 Dunham, I. et al. Minus 28429017-28428648
3372B4 Dunham, I. et al. Minus 28491414-28491094
337293 Dunham, I. et al. Minus 28846334-28845873
337316 Dunham, I. et al. Minus 29657129-29656997
337326 Dunham, I. et al. Minus 30017199-30017069
337382 Dunham, I. et al. Minus 31233666-31233579
337392 Dunham, I. et al. Minus 31442311-31442229
337406 Dunham, I. et al. Minus 31884840-31864588
337412 Dunham, I. et al. Minus 31916487-31916312
337419 Dunham, I. et al. Minus 32021496-32021170
337436 Dunham, I. et al. Minus 32257869-32257739
337455 Dunham, I. et al. Minus 32434517-32434425
337509 Dunham, I. et al. Minus 33414613-33414498
337518 Dunham, I. et al. Minus 33796750-33796647
337529 Dunham, I. et al. Minus 34043668-34043546
337533 Dunham, I. et al. Minus 34193388-34193261
337539 Dunham, I. et al. Minus 34254490-34254322
337551 Dunham, I. et al. Minus 34524446-34524362
337553 Dunham, I. et al. Minus 24230-24160
337591 Dunham, I. et al. Minus 1006414-1006184
337592 Dunham, I. et al. Minus 1007791-1007634
337593 Dunham, I. et al. Minus 1009460-1009291
337607 Dunham, I. et al. Minus 1355719-1355637
337612 Dunham, I. et al. Minus 1570235-1570142
337635 Dunham, I. et al. Minus 2169690-2169569
337824 Dunham, I. et al. Minus 4559540-4559266
337825 Dunham, I. et al. Minus 4567155-4567005
337850 Dunham, I. et al. Minus 507714^5076943
337854 Dunham, I. et al. Minus 5153435-5153272
337913 Dunham, I. et al. Minus 6149843-6149786
337915 Dunham, I. et al. Minus 5922748-5922690
337968 Dunham, I. et al. Minus 7095797-7095680
338010 Dunham, I. et al. Minus 7754282-7754184
338012 Dunham, I. et al. Minus 7761421-7761351
338017 Dunham, I. et al. Minus 7884521-7864401
338065 Dunham, I. et al. Minus 7235048-7234950
338094 Dunham, I. et al. Minus 9595602-9595440
338129 Dunham, I. et al. Minus 10915338-10915237
338132 Dunham, I. et al. Minus 10989617-10989530
338150 Dunham, I. et al. Minus 11478551-11478355
338157 Dunham, I. et al. Minus 11731444-11731375
338195 Dunham, I. et al. Minus 13484103-13483972
338255 Dunham, I. et al. Minus 15242294-15242231
338276 Dunham, I. et al. Minus 16109555-16109398
338431 Dunham, I. et al. Minus 19747608-19747496
338448 Dunham, I. et al. Minus 20151152-20151054
338451 Dunham, I. et al. Minus 20174266-20174193
338477 Dunham, I. et al. Minus 20821897-20821838
338534 Dunham, I. et al. Minus 2177123841771170
338682 Dunham, I. et al. Minus 24800712-24800461
338684 Dunham, I. et al. Minus 24827522-24827428
338689 Dunham, I. et al. Minus 24693073-24892972
338695 Dunham, I. et al. Minus 25104153-25104016
338825 Dunham, I. et al. Minus 27664798-27664712
338842 Dunham, I. et al. Minus 27824238-27824079
338893 Dunham, I. et al. Minus 28491807-28491631
338904 Dunham, I. et al. Minus 28766345-28766253
338935 Dunham, I. et al. Minus 29071537-29071461
339022 Dunham, I. et al. Minus 30523414-30523289
339034 Dunham, I. et al. Minus 30621603-30621422
339190 Dunham, I. et al. Minus 32403103-32402985
339212 Dunham, I. et al. Minus 32494335-32494210
339213 Dunham, I. et al. Minus 32496590-32496440
339216 Dunham, I. et al. Minus 32504250-32504109
339233 Dunham, I. et al. Minus 32751331-32751238
339258 Dunham, I. et al. Minus 32934756-32934615
339262 Dunham, I. et al. Minus 32971258-32971090
339263 Dunham, I. et al. Minus 32974634-32974452
339265 Dunham, I. et al. Minus 32975943-32975806
339338 Dunham, I. et al. Minus 33468728-33468606
339396 Dunham, I. et al. Minus 34017306-34017205
339400 Dunham, I. et al. Minus 34045024-34044940
339425 Dunham, I. et al. Minus 34407911-34407798
325207 6552430 Plus 140049-140170 
329568 3982490 Plus 36331-36750
329517 3983513 Minus 53197-53269
325313 5866865 Minus 27385-28192
325327 5866875 Plus 75189-75264
325317 5866878 Minus 156551-156649
325257 5866895 Plus 10867-10955
329632 6729060 Plus 192813-193017
325371 5866920 Minus 1035422-1035536
325375 5B66920 Minus 1165503-1165810
325378 5866920 Minus 1187981-1188167
325469 6017034 Plus 286823-286991
325470 6017034 Plus 287570-287663
325576 6552443 Minus 137769-137894
325505 6682451 Minus 240852-240946
325543 6682452 Plus 151873-152057
329835 5302817 Minus 62522-62622
329636 5302817 Minus 64959-65078
325593 5866992 Minus 469726-469860
325675 5867014 Plus 955517-955711
325704 5867028 Plus 156198-156387
325682 6138923 Plus 370618-370763
325785 6381957 Plus 61849-62003
325666 6469822 Plus 16769-16857
325818 6682490 Minus 120278-120559
329777 6002090 Minus 191389-191479
329768 6015501 Plus 118315-118422
329759 6048280 Minus 37647-37730
329731 6065783 Plus 158772-158900
329687 6117856 Minus 22165-22288
329676 6272128 Minus 142207-142359
329667 6272129 Plus 101355-101745
329669 6272129 Plus 131223-131291
329670 6272129 Plus 131351-131495
329641 6468233 Minus 105995-106107
329791 6469354 Minus 131982-132089
325826 5667048 Minus 46361-46458
325829 5867052 Plus 232674-233060
329888 6067149 Minus 37227-37473
329893 6S25313 Minus 166123-166791
329899 6563505 Minus 111058-111783
325988 5867064 Plus 17349-17606
325855 5867067 Plus 276141-276251
325999 5867073 Plus 149115-149192
326001 5867073 Plus 155223-155348
325886 5867087 Plus 194694-194915
325882 5867087 Minus 8178-8347
325905 5867104 Plus 78779-78876
325922 5867122 Minus 329063-329134
325937 6867132 Minus 152633-152902
325960 58S7147 Minus 162506-162635
325961 5867147 Minus 165106-165209
325838 6552452 Plus 17145M71532
325839 6552452 Plus 181984-182037
325840 6552452 Plus 184380-184547
325844 6552453 Minus 14188-14332
325870 6682492 Plus 228209-228297
329984 4646193 Minus 139780-139890
329976 4878063 Minus 62584-62691
329935 6165200 Minus 6905949127
329916 6223624 Plus 36396-37195
330021 6671889 Plus 120938-121032
330024 6671908 Minus 1005-1270
330028 6671908 Minus 30015-30144
£6033 5867178 Plus 37261-37333
326036 5867178 Minus 120215-120273
326056 5867184 Minus 181553-181690
326116 5867193 Plus 4554845604
326122 5867194 Plus 144397-144683
326138 5867203 Minus 179374-179436
326145 5867204 Minus 52599-52814
326180 5867211 Minus 182758-183222
326201 5867216 Minus 166168-166959
326207 5867222 Plus 4813*48219
326226 5867230 Plus 52644-52705
326233 5867232 Plus 124788-124863
326238 5867260 Plus 64282-64338
326241 5867260 Minus 181648-181916
326243 5867261 Plus 123838-123978
326251 5867263 Minus 82716-82822
326268 5867267 Plus 122114-122765
326124 5916395 Plus 407102-407560
326339 6056311 Minus 164637-165251
330049 4567182 Minus 314662-315210
326358 5867293 Plus 9122-9195
326365 5867297 Minus 96630-96764
326379 5867327 Plus 32299-32402
326382 5B67327 Minus 50420-50503
326390 5867340 Minus 108814-110592
326424 5867369 Minus 168329-168409
326453 5867399 Plus 86222-86423
326472 5867404 Plus 293739-293940
326492 5867422 Plus 120768-120991
326533 5867441 Minus 532153-532280
330117 6015201 Minus 7340-7680
330115 6015202 Plus 11403-11677
330116 6015202 Plus 12109-12418
330095 6015278 Plus 15343-15814
330096 6015278 Plus 4937049458
326644 5B67559 Plus 42684-42819
326713 6667595 Plus 121511-121798
326745 5867611 Plus 127130-127318
326752 5867615 Minus 1214-1562
326753 5867616 Plus 12454-12511
326598 5867634 Plus 68955-69014
326667 6552455 Plus 142311-142441
326855 6552460 Minus 111390-111463
326812 6682504 Plus 189811-189941
327005 5867664 Plus 610847-610907
327008 5867664 Plus 928737-928811
326896 5867680 Minus 12032-12122
326904 5867684 Minus 9280-9606
326951 6004446 Plus 193812-193998
326941 6004446 Plus 62018-62896
326943 6004446 Minus 89242-89427
326928 6456782 Minus 291007-291219
326958 6469836 Minus 42952-43082
326959 6469836 Minus 43159-43301
327039 6531965 Plus 694486494998
327127 6682520 Plus 41925-42083
330158 6580367 Plus 8196642456
327204 5867447 Plus 165135-165239
327208 5867447 Plus 180805-180864
327266 5867462 Minus 8240042615
327277 5867473 Minus 165616-165715
327289 5867481 Plus 4929649536
327296 5867492 Plus 7627-8166
327237 5867544 Minus 59702-59813
327145 5867548 Minus 40482-40551
327333 5902477 Minus 141448-141609
327335 5902477 Minus 142979-143124
327343 6017017 Minus 122B8-12395
327350 6249563 Minus 4189041985
327358 6552411 Minus 3802-3950
327360 6552411 Minus 62554422
327409 5867750 Minus 52949-53011
327424 5867751 Plus 160442-160598
327430 5867754 Plus 1320-1403
327470 5867772 Plus 150910-150973
327460 6004455 Plus 175245-175343
327498 6017023 Minus 42178-42283
327509 6117815 Minus 54882-55053
327510 6117815 Minus 56824-56944
327512 6117815 Plus 176256-176325
327535 6525279 Plus 19105-19175
330163 6042042 Minus 20321-20385
330171 6648220 Plus 110889-111575
327579 5867824 Minus 37229-38335
327672 5867843 Minus 6964949740
327629 5867872 Plus 4959249811
327640 5867890 Plus 9448-9566
327649 5867899 Plus 205871-205927
327612 6525283 Plus 2747-2924
327716 6525284 Plus 8612346186
327801 5867924 Plus 23239-23348
327762 5867961 Minus 50303-50439
327763 5867961 Plus 229347-229476
327776 5867964 Minus 164308-164486
327622 5867968 Minus 168886-169633
327823 5B67968 Minus 170359-170433
327807 5687968 Plus 33745-33811
327845 6531962 Plus 193402-193549
330228 6013527 Minus .37193787
330190 6165182 Plus 3610336243
328122 5868031 Plus 158474-158656
328132 5868038 Minus 126737-126839
328159 5868065 Minus 52957-53162
328168 5868071 Plus 60321-60479
328175 5668073 Plus 208-271
328217 5868096 Minus 3742-4382
327865 5868130 Plus 61503-62205
327866 5868131 Minus 28933046
327870 5868131 Plus 5355843757
327679 5868142 Minus 77722-77793
327902 5868156 Minus 133339-133467
327918 586B165 Plus 547530447591
327934 5868184 Plus 4183042036
327959 5868210 Minus 46497-46682
327976 5868212 Minus 349301-349409
328020 5902482 Minus 556386456652
326042 5902482 Minus 1985085-1986626
326008 5902482 Plus 296663-297151
330301 2905862 Minus 44204781
330299 2905881 Minus 1020-1382
328274 5868219 Minus 3124441439
328595 5668224 Plus 148738-148967
328591 5868227 Minus 237647-237726
328668 5868254 Minus 10888-10984
328677 5868256 Minus 5870848950
328687 5868262 Plus 624479424585
328706 5868270 Plus 165501-165614
328711 5868271 Minus 97797-97990
328730 5868289 Plus 80684214
328732 5868289 Plus 3743747550
328734 5868289 Plus 50559-50747
328752 5868298 Minus 114911-115087
328755 5868301 Minus 145959-146446
328761 5868302 Minus 239308-239412
328775 5868309 Plus 12845-12920
328784 5868309 Minus 74523-74604
328787 5868309 Plus 135772-135963
328809 5868327 Plus 91792-91849
328829 5868337 Plus 3630946630
328280 5868352 Plus 160563-160631
328311 5868371 Minus 170560-170826
328318 5868373 Plus 414945415620
328323 5868373 Minus 1080089-1080235
328348 5868383 Minus 260272-260379
328377 5868390 Plus 16947-17023
328436 5868417 Plus 203760-203904
328504 5868471 Plus 4706447217
328506 5868471 Plus 60716-60830
328522 5868477 Plus 1972307-1972452
328525 5868482 Plus 12387-14313
328541 5868486 Plus 130956-131050
328662 6004473 Plus 1184773-1164855
328663 6004473 Plus 1185279-1166634
328803 6004475 Minus 291716-291948
328304 6004478 Minus 3884-3952
328927 5868500 Minus 428829428893
328936 5868500 Minus 1352202-1352259
328939 6004481 Minus 131139-131320
328941 6456765 Minus 9817-9885
328948 6456765 Plus 28227-28413
328968 6456775 Plus 117442-118283
330316 6007576 Minus 119761-119931
330350 3056622 Minus 26413-26820
330351 3056622 Minus 27522-27614
330348 4544475 Minus 19855-19962
329(04 5868561 Minus 32819-32939
329046 5868569 Plus 18971-19030
329053 5868574 Plus 426453426541
329186 5868711 Minus 13108-13225
329237 5868729 Plus 133238-133339
329276 5868762 Minus 222629-222709
329333 5868806 Plus 392666-392746
329376 5868859 Plus 52356-52694
329384 5868869 Minus 116524-116662
329140 6017060 Plus 290842-290905
329317 6381976 Plus 614823415209
328319 6381976 Plus 721390-721470
329129 6588026 Plus 144569-144712
329373 6682537 Minus 38950-39301
329412 6682553 Minus 6894849041
329424 5868879 Plus 362196-362344
329446 5868886 Plus 8477644899
329449 5868886 Plus 97697-97771
TABLE 14 shows genes, including expressed sequence tags, down-regulated in prostate
tumor tissue compared to normal prostate tissue, as analyzed using the Affymetrix/Eos Hu02
GeneChip array. Shown are the ratios of "average" normal prostate to "average" prostate
cancer tissues.



ExAccn: Exemplar Accession number, Genbank accession number

UnigeneID: Unigene number

Unigene Title: Unigene gene title

R1: Background subtracted normal prostate : prostate tumor tissue
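
By way of illustration only, the R1 ratio defined above can be sketched as follows. This is a minimal sketch under stated assumptions, not the procedure actually used to generate the table: the function name, the `background` and `floor` parameters, the use of a simple arithmetic mean, and the sample intensities are all hypothetical and are not taken from the specification.

```python
from statistics import mean


def background_subtracted_ratio(normal, tumor, background=0.0, floor=1.0):
    """Return an R1-style ratio: average normal signal over average tumor signal.

    All parameters are illustrative assumptions:
      normal, tumor -- lists of raw probe intensities per tissue sample
      background    -- a single background level subtracted from each average
      floor         -- small positive clamp so the ratio stays defined
    """
    # Average each tissue class, subtract the assumed background, and clamp
    # to the floor so that weak signals cannot yield a zero or negative value.
    normal_avg = max(mean(normal) - background, floor)
    tumor_avg = max(mean(tumor) - background, floor)
    return normal_avg / tumor_avg


# Hypothetical intensities: (240 - 20) / (50 - 20) = 220 / 30
ratio = background_subtracted_ratio(
    [250.0, 230.0, 240.0], [40.0, 50.0, 60.0], background=20.0
)
print(round(ratio, 2))
```

Under these assumptions, a ratio above 1 corresponds to a gene that is down-regulated in tumor tissue relative to normal tissue, which is the direction reported in the R1 column.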

Pkey ExAccn UnigeneID Unigene Title R1

331328 AA281133 Hs.58808 ESTs 18.53
320875 D60641 Hs.131921 ESTs 1455
300994 AI2S1936 Hs.148298 ESTs 12.17
323461 AA418762 Hs.190044 ESTs 1055
301015 AA947682 Hs.217173 ESTs; Weakly similar to Chain A; Cdc42hs-Gdp Complex [H.sapiens] 10.17
319419 AA543098 Hs.13848 ESTs; Highly similar to mitogen-induced [M.musculus] 92
323486 C05278 Hs.188800 ESTs; Moderately similar to [PYRUVATE DEHYDROGENASE (LIPOAMIDE)] KINASE ISOZYME 4 PRECURSOR [H.sapiens] 8.87
324882 AW4190B0 Hs.250645 ESTs 8
330569 U57786 Hs.57679 zinc finger protein 192 7.88
330126 CH21_p2gi|6093735 75
316265 AA737400 Hs.142230 ESTs 7.7
323045 AA148950 Hs.186836 ESTs 7.64
320668 R58399 Hs.146217 ESTs 7A
330769 AA465192 Hs.16514 ESTs 7.15
312614 AI766732 Hs.201194 ESTs 7
314790 AW341754 Hs.189305 ESTs 6.83
309979 AW452118 Hs.257533 EST 6.74
314236 AA743396 Hs.189023 ESTs 6.49
329192 CHX_hsgi|5868716 6.1
324307 AA627642 Hs.4994 transducer of ERBB2; 2 (TOB2) 5.99
303685 AW500106 EST cluster (not in UniGene) with exon hit 552
314921 AW452382 Hs.257564 ESTs 55
315840 AA679001 Hs.192221 ESTs 558
332776 AA034364 Hs.256551 ESTs; Weakly similar to !!!! ALU CLASS B WARNING ENTRY !!!! [H.sapiens] 5.43
313533 AW298141 Hs.157975 ESTs 54
303494 F30712 EST cluster (not in UniGene) with exon hit 555
317490 AI627356 Hs.148367 ESTs 551
332546 D84454 Hs.21899 solute carrier family 35 (UDP-galactose transporter); member 2 525
334719 CH22_FGENES.421_30 525
300679 AA813958 Hs.207727 ESTs; Moderately similar to KIAA0071 [H.sapiens] 522
311811 A1625304 Hs.190312 ESTs 522
315310 AW511298 Hs.256067 ESTs 5.19
312871 H86747 Hs.227602 WAA11 16 protein 5.11
324715 A1739168 EST cluster (not in UniGene) 4.97
313870 AW206435 Hs.146057 ESTs 457
321453 N50080 Hs.117827 ESTs 4.78
316160 AW197887 Hs.253353 ESTs 453
313833 AA766825 EST cluster (not in UniGene) 458
315850 AW270550 Hs.116957 ESTs 453
303124 AF161350 EST cluster (not in UniGene) with exon hit 4.46
323346 AL134932 Hs.143607 ESTs 4.4
301383 AA913591 Hs.12B480 ESTs 455
324513 AW501678 Hs.164577 ESTs 428
303480 AA331906 EST cluster (not in UniGene) with exon hit 425
323591 AA301270 EST cluster (not in UniGene) 422
313603 AW468119 EST cluster (not in UniGene) 42
317863 AI733395 Hs.129124 ESTs 4.1
312381 R42049 Hs.195473 ESTs 4.08
317514 AW451570 Hs.126850 ESTs 4.03
319750 AA621606 Hs.117856 ESTs 4.03
fl^fi gQ T55958
314754 AW026761 
318473
307848 
300730 
303034 
324668 
324674 
300547 
316100 
314801 



313188 
314187 
311826 



AI364166 

AW449204 

W60843 

AI679131 

AA541323 

N53442 

AW2Q3966 

AA481027 

D59945 

A1039702 



311441 
321914 
332216 
324771 
323691 



309709 
300038 



313029 
304356 
314610 
329815 
314949 
300598 
329218 
315706 
303751 



321414 
312187 
334061 
336036 
321477 
315760 
316733 



323611 
314138 
316774 
308884 
331317 
317221 



AA765470 

D81150 

Z38720 

AA011603 

H95082 

AA631739 

AA317561 

AW516519 

AW242630 

AI088192 
AA731520 
AA196027 
AI948688 

AI745387 
N53574 

AW440742 

AW503637 

A1347274 

AA324975 

AA700439 



H67818 

AW139383 

AA811713 

AW235248 

AA304986 

AA740616 

AA814859 

A1833131 

AA258222 

AI989538 

AA749062 



321040 
308828 
300778 
316667 
324614 
316468 
300571 
314301 
312335 
322957 
316848 
313473 
318518 
313383 



304257 
309917 
319661 



AI824829 

AA236233 

AW015940 

AW503101 

AW293046 

AI239706 

AW297967 

AW043620 

AA247755 

AA830053 

AA009660 

T27119 

A1076370 

AA458637 

AA053294 

AW340014 

H08035 



Hs.134374 
Hs208973 
Hs.146863 

H&257125 

HS31570 

Hs.201424 

Hs.1 15831 

Hs.143443 

Hs2130G3 

Hs.127336 

Hs.179573 
Hs.1 18920 
Hs.122826 

Hs.151014 

Hs.102332 



Hs.1 15130 



Hs.135474 
Hs.170504 
Hs.195188 
HS.191805 

H&239124 
Hs.158932 

Hs.1 55556 



Hs.128993 
Hs.188490 



H&222059 
Hs245437 
Hs.1 63222 
Hs.79828 
HS.145704 



Hs.179100 
HsX7757 
Hs.191074 
Hs.180285 



Hs.188716 
Hs232234 

H&255158 
Hs.189886 
H&18B181 
Hs236993 

Hs.126798 
HS251948 

Hs.134037 
Hs.1 52207 



HS21398 



EST cluster (not in UniGene)

ESTs

ESTs

ESTs

EST singleton (not in UniGene) with exon hit

ESTs

ESTs

ESTs

ESTs

ESTs

ESTs

ESTs; Weakly similar to ORF YGR245C [S.cerevisiae]

EST cluster (not in UniGene)

collagen; type I; alpha 2

ESTs

ESTs

EST cluster (not in UniGene) with exon hit
ESTs

EST cluster (not in UniGene)
EST

EST cluster (not in UniGene)
EST cluster (not in UniGene)
ESTs

EST singleton (not in UniGene) with exon hit
AFFX control: MurIL4

ESTs; Weakly similar to ATP-DEPENDENT RNA HELICASE A [H.sapiens]
ESTs

glyceraldehyde-3-phosphate dehydrogenase
ESTs



ESTs
ESTs

CHJLhsgi|5868726
ESTs

EST cluster (not in UniGene) with exon hit

EST singleton (not in UniGene) with exon hit

ESTs; Weakly similar to KIAA0465 protein [H.sapiens]

ESTs

CH22J=GENES327J4

CH22_FGENES.678_7

ESTs

ESTs

ESTs

ESTs

ESTs

EST cluster (not in UniGene)

EST cluster (not in UniGene)

ESTs

ESTs

ESTs

ESTs

EST cluster (not in UniGene)

EST singleton (not in UniGene) with exon hit

ESTs

ESTs

EST cluster (not in UniGene)

ESTs

ESTs

ESTs

ESTs

EST cluster (not in UniGene)
ESTs

ESTs; Moderately similar to T07D3.7 [C.elegans]

EST cluster (not in UniGene)

ESTs

ESTs

EST singleton (not in UniGene) with exon hit
EST singleton (not in UniGene) with exon hit
ESTs; Moderately similar to PUTATIVE GLUCOSAMINE-6-PHOSPHATE 
4
4 
4 

3.96 

3.95 

3.94 

353 

3X 

3.88 

083 

3.79 

3.75 

3.74 

3.73 

3.73 

3.7 

3.68 

3.66 

359 

332 

35 

3.49 

347 

3.46 

338 

336 

335 

334 

333 

3.32 

331 

33 

128 

328 

325 

325 

325 

325 

323 

323 

321 

32 

32 

32 

3.19 

3.17 

3.16 

3.11 

3.1 

3X8 

3X8 

3X8 

3.08 

3X7 

3X7 

3.07 

3.07 

3.06 

3.05 

3.03 

3X1 

3X1 

2.99 

2X8 

2X7 

2X6 

2X5 

2X5 
317672
323416 
312652 
324094 
319761 
317013 
317383 
314659 
312479 



311624 
321992 
316074 



312071 
312684 



322139 
304168 



ISOMERASE [H.sapiens] 
321253 A1699484 EST cluster (not in UniGene) 

321193 AA149508 Hs. 103288 ESTs 
332864 CH22.FGENES.28_4 
300027 

M11507 AFFX control: transferrin receptor 

324330 AA884766 EST cluster (not in UniGene) 

320014 AA137114 Hs.170291 ESTs 
333916 CH22_FGENES296_5 
318885 Z43272 EST cluster (not in UniGene) 

318146 W040125 Hs.150521 ESTs 
323348 AA233G56 Hs.191518 ESTs 
305703 AA825148 H&21229 F-box protein Fowl b 
CH22_FGENES329_7 
AW205409 Hs.127748 ESTs 
AI610397 Hs.159560 ESTs 
A1419909 Hs.160994 ESTs 
AA382603 EST cluster (not in UniGene) 

R84237 EST duster (not in UniGene) 

AA864468 Hs.135646 ESTs . 
AA913887 Hs.126511 ESTs 
AW277121 H&254881 ESTs 

AI950844 Hs.128738 ESTs; Weakly similar to non-lens beta gamma-crystallin like protein [Rsapiens] 

CH22.FGENES7J0 
AW293826 H&250610 ESTs 
C06003 H&116456 ESTs 
AW517542 H&208382 ESTs 

AW296076 EST singleton (not in UniGene) with exon hit 

AA683529 Hs.143119 ESTs 
AW294020 Hs.117721 ESTs 

AA062971 Hs.181161 ESTs; Weakly simitar to INHIBITOR OF APOPTOSIS PROTEIN 1 [M.musculus] 
H53744 EST cluster (not in UniGene) 

H77679 EST singleton (not in UniGene) with exon hit 

CH.13_hsgI|5866994 
R59096 Hs.136698 ESTs 

N75450 EST duster (not In UniGene) with exon hit 

AA831215 Hs.159066 ESTs; Weakly similar to predicted using Genefinder [Cetegans] 
A1091458 Hs.134559 ESTs 

R38715 Hs.153529 Homo sapiens clone 24540 mRNA sequence 
AI823999 Hs. 162000 ESTs 

AA614308 EST singleton (not in UniGene) with exon hit 

AI431345 Hs.161784 ESTs 
AW193466 Hs.136525 ESTs 
A1057369 Hs.122536 ESTs 
AA135565 H&250739 ESTs 
Hs.156939 ESTs 

H&255738 ESTs; Moderately similar to gag [H^apiens] 
Hs255074 ESTs; Moderately simaar to high-risk human papilloma viruses E6 
oncoproteins targeted protein E6TP1 alpha [ttsapiens] 
EST cluster (not in UniGene) 
CH22_DA59H18.GENSCAN.28-7 
EST cluster (not in UniGene) 
AI817933 HSJ209584 ESTs 
R06841 EST cluster (not in UniGene) 

AI248571 Hs.186837 ESTs 
AAB361 16 EST duster (not in UniGene) 

CH.19jsgi|5867435 
AW015506 Hs.130730 ESTs 

AF090948 EST duster (not in UniGene) with exon hit 

H24244 HS240763 ESTs; Weakly similar to /prediction 
AI209108 Hs.143946 ESTs 

CHJUisgI)5868728 
328018 CH36_hsgi|5902482 
323231 AA324437 Hs.177230 ESTs 
312887 AW157377 Hs.132910 ESTs 
315183 AW136134 H&220277 ESTs 
300259 AI479011 Hs.170783 ESTs 
313240 AI743261 Hs.131860 ESTs 
316697 AW293174 H&2526Z7 ESTs 
319885
300611 
316854 
318208 
331623 
324616 



314912 
300767 
313463 
320600 
301160 
324825 
300336 

317850 
339047 
324580 
321142 
31947B 
300793 
313733 
326505 
314987 
303114 
318709 
312878 



AA704457 
AW292417 

N29974 



2.95 
253 
233 
2.92 

2.91 

238 

238 

238 

237 

237 

235 

234 

233 

232 

231 

231 

231 

23 

23 

2.78 

2.76 

2.77 

2.75 

2.75 

2.73 

2.73 

2.73 

2.73 

2.72 

2.72 

2.72 

2.72 

2.71 

2.71 

2.71 

2.69 

2.68 

2.68 

238 . 

237 

237 

237 

235 

235 

235 

235 

234 

234 

2.64 

2.63 

2.62 

2.62 

231 

23 

2.6 

2.6 

239 

236 

237 

236 

236 

235 

235 

235 

234 

234 
313966 AI807551 Hs.189061 ESTs 253
331263 AA015718 ze3ta12.s1 Soares retina N2MHR Homo sapiens cDNA clone IMAGE36574 3' mRNA sequence 251
310683 AW055233 Hs.160870 ESTs 2S
302566 AA085996 Hs£48572 Human PAC clone DJ404F18 from Xq23 IS
302697 AJ0014O8 EST cluster (not in UniGene) with exon hit 2£
308362 AI613519 EST singleton (not in UniGene) with exon hit 2.49
322347 AF086538 EST cluster (not in UniGene) 2.49
316240 AA974253 Hs.120319 ESTs 2.49
323208 AA203415 Hs.136200 ESTs 2.48
321643 W76005 Hs.32094 ESTs 2.48
330723 AA243617 Hs.31082 ESTs; Highly similar to db83 [R.norvegicus] 2.48
323455 AA256675 Hs.200438 ESTs; Weakly similar to atypical PKC specific binding protein [R.norvegicus] 2.47
308383 AI624497 EST singleton (not in UniGene) with exon hit 247
328744 CH.07_hsgi|5868290 2.47
332344 W45574 Hs.252497 ESTs 247
328121 CH.06_hsgi|5868Q31 2.47
321915 AI670955 Hs.200151 ESTs 246
314954 AA521381 Hs.187726 ESTs 245
302821 AA188868 Hs.173933 ESTs; Weakly similar to NUCLEAR FACTOR 1/X [H.sapiens] 245
329454 CH.YJsgI|5868887 245
336605 Ott2_.FGENES.420J 245
300664 AI444628 Hs.256809 ESTs 244
323362 AL135067 Hs.117182 ESTs 2.44
300024 M10098 AFFX control: 18S ribosomal RNA 2.44
325026 AI671168 Hs.12285 ESTs 243
324510 AI148353 Hs.120849 ESTs 243
313389 AI765182 Hs.119903 ESTs 243
301309 M78276 Hs.255917 ESTs 243
313570 AA041455 Hs.209312 ESTs 243
316504 AW135854 Hs.132458 ESTs 242
319401 R01342 EST cluster (not in UniGene) 2.42
312827 AI744361 Hs.205591 ESTs; Weakly similar to zinc finger protein Png-1 [M.musculus] 2.42
327871 CK06J>sgI|5868131 241
337173 CH22.FGENES.565-3 241
302948 AA465635 EST cluster (not in UniGene) with exon hit 241
324303 AL118754 EST cluster (not in UniGene) 24
315527 AI791138 Hs.116768 ESTs 24
315979 AA830515 Hs.222917 ESTs 24
331310 AA253351 Hs.44439 STAT induced STAT inhibitor-4 24
321095 AA017595 Hs.32844 ESTs 24
308561 AI701559 EST singleton (not in UniGene) with exon hit 239
313035 N36417 Hs.144928 ESTs 237
322114 AA643791 Hs.191740 ESTs 231
313671 W49823 Hs.145553 ESTs 231
303211 AA099548 Hs.191436 ESTs; Highly similar to dJ1 11BD24,4 [H.sapiens] 237
301256 AA932948 EST cluster (not in UniGene) with exon hit £36
338165 CH22 _EM^C005500.GENSCAN212-3 2.36
324692 AA557952 EST cluster (not in UniGene) 235
318587 AA779704 Hs.168830 ESTs 235
312378 R41582 Hs.109219 retinal degeneration B beta 2.35
318625 T48446 Hs.193162 ESTs 235
305181 AA663726 Hs.116922 EST 235
300815 AA286678 EST cluster (not in UniGene) with exon hit 234
324063 AW292740 Hs.254815 ESTs 234
315859 AA682305 Hs.133268 ESTs 233
305092 AA642912 EST singleton (not in UniGene) with exon hit 233
306598 AI000320 EST singleton (not in UniGene) with exon hit 233
300307 AI651016 Hs.246311 ESTs 233
321348 249979 EST cluster (not in UniGene) 233
325112 AI90377O Hs.124344 ESTs 232



321363 AJ002574



300680 AW468066 
327120 

302761 AW250553 
312132 AI475490 
315639 AA827652 



CH22_FGENES.43-7

EST cluster (not in UniGene)

CH2^FG0JES.73M
Hs.257712 ESTs; Weakly similar to KIAA0986 protein [H.sapiens]

CH.21_hsgi|6531970

EST cluster (not in UniGene) with exon hit
Hs.170577 ESTs

EST cluster (not in UniGene)



2.32
2.32
2.31
2.31
2.31
2.3
2.3
2.3
312189 T95594 Hs.187435 ESTs 2.3

306537 AA991705 EST singleton (not in UniGene) with exon hit 2.3

327061 CK21_hsgi|6531965 25 

315391 AA759098 Hs.192007 ESTs 22 

322384 AI968646 Hs.33862 ESTs 2.29

323206 AA203339 Hs.220756 ESTs 2.29

318110 AI680915 Hs.201379 ESTs 2.28

335250 CH22_FGENESi16J1 2.28

331696 Z38907 Hs.91662 KIAA0888 protein 2.28

318327 AW294013 Hs.200942 ESTs 2.28

324980 AA969121 Hs.254296 ESTs 2.28

319429 AI608881 Hs.11482 ESTs; Highly similar to junctional adhesion molecule [H.sapiens] 2.28

310601 AI970543 Hs.192605 ESTs 2.28

318905 Z43395 EST cluster (not in UniGene) 2.28

323442 AA252753 Hs.164039 ESTs 2.27

304428 AA342250 Hs.99819 ubiquitin specific protease 16 2.27

313352 AW292127 Hs.144758 ESTs 2.27

316491 AA766025 Hs.238794 EST 2.27

317751 AI697668 Hs.202241 ESTs 2.26

314136 AA229781 Hs.221962 ESTs 2.26

306665 AI004614 Hs.130577 EST 2.26

303946 AW474196 Hs.221604 ESTs 2.25

313435 AA769123 EST cluster (not in UniGene) 2.25

317679 AA968799 Hs.150289 ESTs 2.25

322370 AA330095 EST cluster (not in UniGene) 2.25

306620 AI000929 EST singleton (not in UniGene) with exon hit 2.24

329109 CHJLhsgi|5868626 2.24

311043 AI871209 Hs.177128 ESTs 2.24

300228 AI458372 Hs.158748 ESTs; Weakly similar to synapsin Ib [M.musculus] 2.24

307223 AI193698 Hs.184776 ribosomal protein L23a 2.24

309023 AI888045 EST singleton (not in UniGene) with exon hit 2.23

310749 AI493675 Hs.170332 ESTs 2.23

316769 AI914939 Hs.212184 ESTs 2.22

320409 AA356195 EST cluster (not in UniGene) 2.21

333149 CH22_FGENES.87_8 2.21

324951 M86125 Hs.137487 ESTs 2.21

321939 AI791617 Hs.145068 ESTs 2.2

320594 AI863952 Hs.169436 arginyltransferase 1 2.2

320722 R67430 Hs.172787 ESTs 2.2

321781 D78667 EST cluster (not in UniGene) 2.2

328903 CH.08_hsgi|5868514 2.2

303889 T19204 EST cluster (not in UniGene) with exon hit 2.2

325045 T08845 EST cluster (not in UniGene) 2.2

312828 AI865455 Hs.211818 ESTs; Moderately similar to !!!! ALU SUBFAMILY J WARNING ENTRY !!!! [H.sapiens] 2.19

335109 CH22_FGENES.494_5 2.18

330878 AA131471 Hs.71440 ESTs 2.18

311289 AI971362 Hs.231945 ESTs 2.18

304608 AA513456 EST singleton (not in UniGene) with exon hit 2.18

337393 CH22JGBIESJ47-4 2.18

332812 CH22JGENES7J4 2.18

327665 CH.04_hsgi|5867839 2.18

314581 AW504859 Hs.237849 ESTs 2.17

326508 CH.19_hsgi|6682496 2.17

301242 AW161535 Hs.258803 ESTs 2.17

312780 AI765651 Hs.172900 ESTs 2.17

315954 AW276810 Hs.254859 ESTs 2.16

311179 AJ880843 Hs.223333 ESTs 2.16

315320 AI084182 Hs.186895 ESTs 2.16 

313017 AI015203 Hs.118015 ESTs 2.16 

312430 AW139117 Hs.117494 ESTs 2.15 

300864 AA406539 Hs.190958 ESTs 2.15 

314753 AA463262 EST cluster (not in UniGene) 2.15

322574 AF156548 EST cluster (not in UniGene) 2.15

321409 C03864 EST cluster (not in UniGene) 2.15

321205 AA002047 EST cluster (not in UniGene) 2.14

320406 AA353895 Hs.152983 HUS1 (S. pombe) checkpoint homolog 2.14

337646 CH22_EM:AC000097.GENSCAN.11-2 2.13

303084 AF174008 EST cluster (not in UniGene) with exon hit 2.13

312185 AA654772 Hs.186554 ESTs 2.13
306813 AI068544 EST singleton (not in UniGene) with exon hit 2.13

314465 AA602917 Hs.156974 ESTs 2.12

318168 AI821782 Hs.220587 ESTs; Moderately similar to !!!! ALU SUBFAMILY SC WARNING ENTRY !!!! [H.sapiens] 2.12

315990 AI800041 Hs.190555 ESTs 2.11

320712 R66867 EST cluster (not in UniGene) 2.11

318487 AI167877 Hs.143716 ESTs 2.11

317462 AW015206 Hs.178784 ESTs 2.11

304384 AA235482 Hs.62954 ferritin; heavy polypeptide 1 2.11

314544 AA399018 Hs.250835 ESTs 2.1

319881 T72744 EST cluster (not in UniGene) 2.1

328078 CH.06_hsgi|5868008 2.1

317354 AW090770 Hs.192271 ESTs 2.1

308617 AI738720 EST singleton (not in UniGene) with exon hit 2.09

311568 AW439969 Hs.218177 ESTs 2.09

313605 AI761786 Hs.204674 ESTs 2.09

314269 AA848118 Hs.221216 ESTs 2.08

332933 CH22_FGENES.38_7 2.08

325498 CH.12_hsgi|5866967 2.08

313659 AW296067 Hs.124106 ESTs 2.08

324596 AW149321 Hs.105411 ESTs 2.08

324783 AA540770 EST cluster (not in UniGene) 2.07

302696 AA347452 EST cluster (not in UniGene) with exon hit 2.07

313418 AW450674 Hs.114696 ESTs 2.06

326920 CH.21_hsgi|6456782 2.06

327574 CH.03_hsgi|5867818 2.06

323207 AI052795 Hs.192201 ESTs 2.06

303753 AW503733 Hs.170315 ESTs 2.05

305235 AA670480 EST singleton (not in UniGene) with exon hit 2.05

316055 AA693880 EST cluster (not in UniGene) 2.05

317194 AW445167 Hs.126036 ESTs 2.05

319565 AW408683 Hs.32922 ESTs 2.05

335146 CH22.FGENES.499J 2.05

301475 AI678183 Hs.170917 prostaglandin E receptor 3 (subtype EP3) 2.04

312442 AA120970 Hs.143199 ESTs 2.04

322502 R62925 Hs.243665 ESTs 2.04

303693 AA290875 Hs.30120 ESTs 2.04

310179 AI215643 Hs.171381 ESTs 2.03

321121 W23285 EST cluster (not in UniGene) 2.03

331330 AA282197 Hs.89002 ESTs; Highly similar to CGI-07 protein [H.sapiens] 2.03

306557 AA994530 EST singleton (not in UniGene) with exon hit 2.03

317865 AI298794 Hs.129130 ESTs 2.03

318667 AI493742 Hs.165210 ESTs 2.02

318042 AW294522 Hs.149991 ESTs 2.02

323818 AW245528 Hs.134754 ESTs 2.02

331286 AA137062 Hs.103853 ESTs 2.01

311262 AI989942 Hs.232150 ESTs 2.01

335601 CH22J=GENES£8L41 2.01

311351 AI682303 Hs.201274 ESTs 2.01

312996 AA249018 EST cluster (not in UniGene) 2.01

328190 CH.06_hsgi|5668077 2

338030 CH22_EM:AC005500.GENSCAN.148-16 2

333940 CH22_FGENES.301_6 2

328227 CH.06_hsgi|5868105 2

331481 N27448 Hs.43944 EST 2

335288 CH22.FGENES527J 2

307513 AI274307 EST singleton (not in UniGene) with exon hit 2

323316 AL134620 EST cluster (not in UniGene) 2

319479 R21945 Hs.256153 ESTs 2

303482 AA502583 Hs.197271 ESTs 2

327489 CH.02_hsgi|6004459 1.99

323935 AW175841 Hs.192183 ESTs 1.99

309575 AW168096 Hs.195188 glyceraldehyde-3-phosphate dehydrogenase 1.99

337043 CH22_FGENES.439-19 1.98

312897 AI828174 Hs.227049 ESTs 1.98

307881 AI370434 EST singleton (not in UniGene) with exon hit 1.98

328656 CH.07_hsgi|6004473 1.98

314569 AA813784 Hs.123001 ESTs 1.98

332783 W45302 Hs.87889 helicase-moi 1.98

315259 AA701499 Hs.148115 ESTs 1* 
313171 N67879 Hs.157695 ESTs 1.97

310)60 AI241421 Hs.132236 ESTs 1.97

332256 N66393 Hs.102754 ESTs 1.97

312110 AI962180 Hs.226863 ESTs 1.97

335864 CH22_FGENES.629J 1.97

320389 W00545 Hs.171785 ESTs 1.97

314065 AA868267 Hs.85524 ESTs 1.96

323086 H15474 Hs.12214 Homo sapiens clone 23716 mRNA sequence 1.96

323919 AA862973 Hs.220704 ESTs 1.96

310750 AI373163 Hs.170333 ESTs 1.96

309435 AW090537 EST singleton (not in UniGene) with exon hit 1.96

300129 AW028820 EST cluster (not in UniGene) with exon hit 1.96

320130 AI820675 Hs.203804 ESTs 1.95

323787 AW373446 Hs.169885 ESTs; Weakly similar to cDNA EST EMBL:T02216 comes from this gene [C.elegans] 1.95

338112 CH22_EM:AC005500.GENSCAN.185-24 1.95

313625 AW468402 Hs.254020 ESTs 1.95

325240 CH.10_hsgi|5866848 1.95

331833 AA412102 Hs.250911 interleukin 13 receptor; alpha 1 1.95

332252 N63882 za21&s1 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone IMAGE:293225 3', mRNA sequence 1.95

300279 AW237425 Hs.253817 ESTs 1.95

326023 CH.17_hsgi|5867245 1.95

321609 H86021 Hs.198800 ESTs; Weakly similar to hMmTRA1b [H.sapiens] 1.94

324183 AA402453 Hs.113011 ESTs 1.94

336276 CH22 FGENES.762J 1.94

334913 CH22>GENES.456J 1.94

325417 CH.12_hsgi|5866925 1.94

318489 AW043590 Hs.225023 ESTs 1.94

318455 AI148763 EST cluster (not in UniGene) 1.94

306890 AI092235 EST singleton (not in UniGene) with exon hit 1.94

315073 AW452948 Hs.257631 ESTs 1.94

321289 R84687 Hs.226306 ESTs 1.94

308521 AI689808 EST singleton (not in UniGene) with exon hit 1.93

306382 AA368967 EST singleton (not in UniGene) with exon hit 1.93

331320 AA262999 Hs.42788 ESTs 1.93

324279 AA501412 Hs.191688 ESTs; Weakly similar to Pro-Pol-dUTPase polyprotein [M.musculus] 1.93

309577 AW168753 EST singleton (not in UniGene) with exon hit 1.93

327014 CH.21_hsgi|5867664 1.93

303488 AW025860 EST cluster (not in UniGene) with exon hit 1.93

306561 AA995223 Hs.129559 EST 1.92

330694 AA019606 Hs.108447 spinocerebellar ataxia 7 (olivopontocerebellar atrophy with retinal degeneration) 1.92

313083 N50545 Hs.159200 ESTs 1.92

327752 CH.05_hsgi|5867949 1.92

318674 AA295490 EST cluster (not in UniGene) 1.92

301267 AW297762 Hs.255680 ESTs 1.91

332092 AA608787 Hs.112590 ESTs 1.91

323509 AL036947 EST cluster (not in UniGene) 1.91

321452 AA317554 EST cluster (not in UniGene) 1.91

311483 AI765013 Hs.209128 ESTs 1.91

300976 AJ246374 Hs.185861 ESTs 1.91

323715 AA322155 EST cluster (not in UniGene) 1.91

313800 AW286132 Hs.166674 ESTs 1.91

332029 AA489697 Hs.145053 ESTs 1.91

304013 AW518573 Hs.156110 immunoglobulin kappa variable 1D-6 1.91

322019 AA354549 Hs.41181 Homo sapiens mRNA; cDNA DKFZp727C191 (from clone DKFZp727C191) 1.91

334150 CH22_FGENES.539_1 1.9

310094 AW450967 Hs.235240 ESTs 1.9

316218 AW207642 Hs.174021 ESTs 1.9

324774 AI031771 Hs.132586 ESTs 1.9

326507 CH.19_hsgi|5867435 1.9

314570 AA405696 EST cluster (not in UniGene) 1.9

336268 CH22_FGENES.758_2 1.9

315278 AI985544 Hs.116429 ESTs 1.9

325824 CH.15_hsgi|5867048 1.9

316277 AA737780 Hs.213392 ESTs 1.9

323181 AA418583 Hs.143621 ESTs 1.9

301438 AA961643 Hs.127716 ESTs 1.89

307050 AI147341 Hs.146734 EST 1.89

306830 AI075803 EST singleton (not in UniGene) with exon hit 1.89
302426 AL049925 Hs.225984 DKFZP547G0910 protein 1.89

320127 H72615 Hs.17268 ESTs 1.89

337736 CH22_EM:AC000097.GENSCAN.100-2 1.89

331319 AA262755 Hs.194264 ESTs 1.88

310767 AI377505 Hs.158835 ESTs 1.88

314880 AI732169 Hs.105429 ESTs 1.88

312539 AI004377 Hs.200360 ESTs 1.88

309674 AW205604 Hs.168034 ESTs; Weakly similar to !!!! ALU SUBFAMILY SP WARNING ENTRY !!!! [H.sapiens] 1.88

314621 AI627478 Hs.187670 ESTs 1.88

319495 AI972146 Hs.192756 ESTs 1.88

313472 AA007374 EST cluster (not in UniGene) 1.88

302705 U09060 EST cluster (not in UniGene) with exon hit 1.88

329511 CH.10_p2gi|3983514 1.88

317140 AI699412 Hs.201925 ESTs 1.87

302598 AI815985 Hs.129683 ubiquitin-conjugating enzyme E2D 1 (homologous to yeast UBC4/5) 1.87

301153 AA725670 Hs.120485 ESTs; Weakly similar to serine/threonine kinase with SH3 domain; leucine zipper domain and proline rich domain [H.sapiens] 1.87

332222 N28271 Hs.176618 ESTs 1.87

330703 AA055475 Hs.104143 clathrin; light polypeptide (Lca) 1.87

318470 AI159863 Hs.143713 ESTs 1.87

314014 AW291847 Hs.121715 ESTs; Weakly similar to HP protein [H.sapiens] 1.87

300370 AI827817 EST cluster (not in UniGene) with exon hit 1.86

312329 R84768 Hs.13399 Homo sapiens clone 25032 mRNA sequence 1.86

325587 CH.12_hsgi|6682462 1.86

310237 AI884313 Hs.158906 ESTs 1.86

318872 R13085 EST cluster (not in UniGene) 1.86

303431 AA317915 EST cluster (not in UniGene) with exon hit 1.86

338427 CH22_EM:AC005500.GENSCAN.349-1 1.86

300452 AB52293 Hs.191098 ESTs 1.85

321279 H85330 Hs.146060 ESTs 1.85

301690 F05865 Hs.249180 ubiquitin-conjugating enzyme E2D 2 (homologous to yeast UBC4/5) 1.85

307932 AJ230822 EST singleton (not in UniGene) with exon hit 1.85

318292 AJ679956 Hs.150603 ESTs 1.85

310254 AJ239811 Hs.157491 ESTs 1.85

311790 AW016437 Hs.233462 ESTs 1.84

314248 AA278347 Hs.126078 ESTs 1.84

335588 CH22_FGENES381_25 1.84

3392)9 CH2£_F113D11.GENSCAN.64 1.84

307954 AI419692 EST singleton (not in UniGene) with exon hit 1.84

302549 AF055136 Hs.248162 tectorin alpha 1.84

321629 H67213 Hs.158092 ESTs 1.84

301239 AA807558 EST cluster (not in UniGene) with exon hit 1.84

332434 N75542 Hs.75356 transcription factor 4 1.84

327192 CH.01_hsgi|5857445 1.83

310214 AI220072 Hs.165893 ESTs 1.83

320516 R33857 Hs.181479 ESTs; Weakly similar to E-SELECTIN PRECURSOR [H.sapiens] 1.83

324231 W60827 EST cluster (not in UniGene) 1.83

336616 CH2*_fGENES313j5 1.83

328799 CH.07_hsgi|5868316 1.83

324661 AW504161 EST cluster (not in UniGene) 1.83

313190 AA766707 Hs.153039 ESTs 1.83

301979 L28168 Hs.121495 potassium voltage-gated channel; Isk-related family; member 1 1.82

302099 AL021397 Hs.137576 ribosomal protein L34 pseudogene 1 1.82

320187 T99949 EST cluster (not in UniGene) 1.82

320791 R78808 Hs.33961 ESTs; Weakly similar to !!!! ALU CLASS A WARNING ENTRY !!!! [H.sapiens] 1.82

305733 AA829535 Hs.34298 CD74 antigen (invariant polypept of MHC; class II antigen-associated) 1.82

308280 AI569349 Hs.180920 ribosomal protein S9 1.81

321533 W78877 Hs.40111 ESTs 1.81

312946 AI915122 Hs304087 ESTs; Weakly similar to F33D11.3b [C.elegans] 1.81

319474 H90265 Hs.100638 ESTs 1.81

329519 CH.10_p2gi|3983510 1.81

324685 AA220982 EST cluster (not in UniGene) 1.81

320697 N62937 Hs.139181 ESTs 1.81

329246 CHJC_hsgi|5858732 1.81

332000 AA481271 Hs.193945 ESTs 1.81

310811 AI420990 Hs.161303 ESTs 1.81

325866 CH.16_hsgi|5867076 1.81

322064 Z78343 EST cluster (not in UniGene) 1.8

333712 CH22.FGENES.25U 1.8
313457 AA576052 Hs.193223 ESTs 1.8

321591 H85687 Hs.117927 ESTs 1.8

330260 CH.05_p2gi|6671884 1.8

311080 AI656320 Hs.197711 ESTs 1.8

329522 CH.10_p2gi|3983507 1.8

322889 AA081924 Hs2114t7 ESTs 1.8

300175 AI275011 Hs.204877 ESTs 1.8

330976 H20560 Hs.244624 ESTs 1.8
300208 AI341180 Hs.196115 ESTs; Weakly similar to FIBRILLIN 1 PRECURSOR [H.sapiens] 1.79
319635 R17531 EST cluster (not in UniGene) 1.79
313454 AA730673 Hs.188634 ESTs 1.79
303093 AI400310 Hs.148958 ESTs 1.79
309815 AW292760 EST singleton (not in UniGene) with exon hit 1.79
326506 CH.19_hsgi|5867435 1.79
319845 AA649011 Hs.187902 ESTs 1.79
300290 AI623739 Hs.166387 ESTs 1.79
312180 AI248285 Hs.118348 ESTs 1.79
313058 D81015 Hs.125382 ESTs 1.79
330120 CH.19_p2gi|6671864 1.78
328412 CH.07_hsgi|5868405 1.78
302345 NM_000565 EST cluster (not in UniGene) with exon hit 1.78
308100 AW75949 EST singleton (not in UniGene) with exon hit 1.78
311386 AW205705 Hs.207514 ESTs 1.78
330282 CH.05_p2gi|6671910 1.78
318856 Z43011 Hs^1169 ESTs 1.78
312486 AA845630 Hs.117904 ESTs 1.78
325450 CH.12_hsgi|5866941 1.78
321206 H54178 Hs.226469 ESTs 1.78

330977 H20826 Hs.31783 ESTs 1.78
303487 AA333666 EST cluster (not in UniGene) with exon hit 1.77
310398 AI264671 Hs.164166 ESTs 1.77
313230 AI540166 Hs.129563 ESTs 1.77
317747 AI683782 Hs.128245 ESTs 1.77
303381 AL038841 Hs.163313 ESTs; Weakly similar to !!!! ALU SUBFAMILY SB WARNING ENTRY !!!! [H.sapiens] 1.77
336123 CH22JGENES.70U 1.77
300185 AI286182 Hs.208484 ESTs 1.77
316002 AW451733 Hs.119824 ESTs 1.77
319650 AA001811 Hs.83722 ESTs 1.77
329941 CH.16_p2gi|6165199 1.77
328329 CH.07_hsgi|5868375 1.77
322934 AI493054 Hs.158968 ESTs 1.77
325902 CH.16_hsgi|5867101 1.76
322239 W01813 Hs.12109 WD40 protein Ciao1 1.76
303530 AE74851 Hs558744 ESTs 1.76
300980 AI025527 Hs.222097 ESTs 1.76
331909 AA437300 Hs.178210 ESTs 1.76
321553 H92449 Hs.116406 ESTs 1.76
301618 T52760 EST cluster (not in UniGene) with exon hit 1.76
319592 AA627356 Hs.163315 ESTs 1.76
318511 T26528 Hs.227175 ESTs; Weakly similar to !!!! ALU SUBFAMILY SQ WARNING ENTRY !!!! [H.sapiens] 1.76
327183 CH.01_hsgi|5867442 1.76
313516 AA029058 Hs.135145 ESTs 1.76
318644 AI752482 EST cluster (not in UniGene) 1.76
321632 AA419617 EST cluster (not in UniGene) 1.76
324657 AW451142 Hs.255628 ESTs 1.76
300437 AW449374 Hs.157149 ESTs 1.75
319775 AA504429 Hs.6211 methyl-CpG binding domain protein 1 1.75
314775 AI149880 Hs.188809 ESTs 1.75
337460 CH22_FGENES.780-5 1.75
309849 AW297444 EST singleton (not in UniGene) with exon hit 1.75
301471 AA995014 Hs.129544 ESTs; Weakly similar to ORF YLL027W [S.cerevisiae] 1.75
312739 AI318426 Hs.155925 ESTs 1.75
319995 H15355 Hs.60887 ESTs 1.75
326495 CH.19_hsgi|5867423 1.75
337497 CH2?_FGENES30M 1.75
322633 AA004534 Hs.153981 ESTs 1.75
332177 F10812 Hs.101433 ESTs 1.75
326930 CH.21_hsgi|8456782 1.75
316893 AA837332 EST cluster (not in UniGene) 1.75




324826 AA704806 Hs.143842 ESTs 1.75 

311269 AI656924 Hs.174257 ESTs 1.75 

309375 AW075342 EST singleton (not in UniGene) with exon hit 1.75

314171 AI821895 Hs.193481 ESTs 1.75

311684 AI990741 Hs.252809 ESTs 1.75

334387 CH22LFGENES.380J 1.75

312195 AI300101 Hs.252222 ESTs 1.75

315707 AI418055 Hs.161160 ESTs 1.74

324349 AW501470 EST cluster (not in UniGene) 1.74

300724 AI762929 Hs.206134 ESTs; Weakly similar to similar to reverse transcriptase [C.elegans] 1.74

309906 AW339340 EST singleton (not in UniGene) with exon hit 1.74

303714 AW501336 EST cluster (not in UniGene) with exon hit 1.74

318704 Z24981 EST cluster (not in UniGene) 1.74

303027 AF111178 EST cluster (not in UniGene) with exon hit 1.74

322601 W92924 EST cluster (not in UniGene) 1.74

319382 H93199 Hs.33665 ESTs 1.74

315858 AA737345 EST cluster (not in UniGene) 1.74

332243 N55484 Hs.220540 ESTs; Highly similar to ARYL HYDROCARBON RECEPTOR NUCLEAR TRANSLOCATOR [H.sapiens] 1.74

330951 H02566 Hs.191268 Homo sapiens mRNA; cDNA DKFZp434N174 (from clone DKFZp434N174) 1.74

324044 AL045752 Hs.211519 ESTs 1.73

320630 AA199847 EST cluster (not in UniGene) 1.73

327288 CH.01_hsgi|5867481 1.73

314986 AI201367 Hs.142860 ESTs 1.73

319078 H17255 Hs.144515 ESTs 1.73

326278 CH.17_hsgi|5867269 1.73

302552 H49792 EST cluster (not in UniGene) with exon hit 1.73

322322 AF086431 EST cluster (not in UniGene) 1.73

327075 CH.21_hsgi|6531865 1.73

317392 AI797588 Hs.145459 ESTs 1.73

300810 AKJ76890 Hs.186949 ESTs 1.73

315978 AA830893 Hs.119769 ESTs 1.73

323903 AA773580 Hs.193598 ESTs 1.73

330803 AA004699 Hs.150580 putative translation initiation factor 1.73

309845 AW296802 Hs.255580 EST 1.73

314963 AI689617 Hs.200934 ESTs 1.73

311710 F09774 Hs.175971 ESTs 1.73

315315 AI984592 Hs.15088 ESTs 1.73

300378 AA663560 Hs.235873 ESTs; Weakly similar to K11C4.2 [C.elegans] 1.73

316141 AW303457 EST cluster (not in UniGene) 1.72

319826 T71739 Hs.75442 albumin 1.72

312961 AI033922 Hs.122517 ESTs 1.72

334379 CH22.FGENES379J1 1.72

305854 AA862733 EST singleton (not in UniGene) with exon hit 1.72

313031 N34927 Hs.186566 ESTs 1.72

329728 CH.14_p2gi|6065785 1.72

312090 N57692 Hs.118064 ESTs 1.72

323341 AL134875 Hs.192386 ESTs 1.72

302077 AA310580 Hs.132898 Homo sapiens chromosome 11; BAC CIT-HSP-311e8 (BC269730) containing the hFEN1 gene 1.71

310766 AI971438 Hs.158824 ESTs 1.71

311450 AI809985 Hs.203340 ESTs 1.71

311792 AW238064 Hs.253909 ESTs 1.71

321500 H71999 EST cluster (not in UniGene) 1.71

311948 T78791 Hs.241569 ESTs; Moderately similar to !!!! ALU SUBFAMILY SQ WARNING ENTRY !!!! [H.sapiens] 1.71

302270 R56151 EST cluster (not in UniGene) with exon hit 1.71

329089 CHJLhsgi|5868614 1.71

322331 AF086467 EST cluster (not in UniGene) 1.71

318235 AI080361 Hs.134217 ESTs 1.71

304561 AA489792 EST singleton (not in UniGene) with exon hit 1.71

312681 AI028149 Hs.193124 pyruvate dehydrogenase kinase; isoenzyme 3 1.71

310250 AI478629 Hs.158465 ESTs 1.71

338178 CH22_EM:AC005500.GENSCAN.219-6 1.71

338910 CH22.rjJ32l10.GENSCAN.11-2 1.71

321225 AL080073 Hs.251414 Homo sapiens mRNA; cDNA DKFZp564B1462 (from clone DKFZp564B1462) 1.7

322289 AA534550 Hs.539 ribosomal protein S29 1.7

319802 AI701489 Hs.202501 ESTs 1.7

314022 AW452420 Hs.248678 ESTs 1.7

314937 AA515602 Hs.152330 ESTs 1.7



303344 AA255977 Hs.250646



Hs.146315 
Hs.204096



315702 AA657501 
302385 AJ224172 
319699 R14537 
309506 AW137700 
330417 084424 
315296 AA876905 



323923 AA354146 

320303 AL079289 

302967 AI927068 

310695 AI472124 

307512 AI273815



HS57697 
Hs.125286 



Hs.137154 
Hs.110853
Hs.157757
Hs.242463



300580 AA761322 Hs.220538 ESTs

304398 AA262785 EST singleton (not in UniGene) with exon hit

313421 AW339515 Hs.163700 ESTs

309763 AW270182 EST singleton (not in UniGene) with exon hit

322092 AF085833 EST cluster (not in UniGene)

315603 AA764768 Hs.121158 ESTs
325031 T08597 EST cluster (not in UniGene)

327157 CH.01_hsgi|5866841
314809 AJ741461 Hs.161904 ESTs
320361 H67220 Hs.146406 nitrilase 1
324721 AW402302 Hs.43616 ESTs

CH.07_hsgi|5868246

ESTs; Highly similar to ubiquitin-conjugating enzyme [M.musculus]
CH.08_hsgi|6456775
ESTs

lipophilin B (uteroglobin family member); prostatein-like
EST cluster (not in UniGene)
EST singleton (not in UniGene) with exon hit
hyaluronan synthase 1
ESTs

CH.07_hsgi|5868485
EST cluster (not in UniGene)

Homo sapiens mRNA full length insert cDNA clone EUROIMAGE 35971
ESTs; Weakly similar to R10D12.12 [C.elegans]
ESTs
keratin 8

CH22_EM:AC005500.GENSCAN.590-10
Homo sapiens mRNA for alpha integrin binding protein 80; partial
EST cluster (not in UniGene) with exon hit
ESTs
ESTs
ESTs

CH.14_hsgi|6381953
EST cluster (not in UniGene)
EST cluster (not in UniGene)
EST cluster (not in UniGene)
ESTs
ESTs

EST cluster (not in UniGene)
CH.08_p2gi|5932415
CHJLhs 955868502
CH22LF6ENES.318J
Hs.128457 ESTs

EST cluster (not in UniGene)
Hs.17385 ESTs

CH.12_hsgi|5866941
315106 AW452184 Hs.232100 ESTs
326014 CH.16_hsgi|5857160
307130 AI185234 EST singleton (not in UniGene) with exon hit

300943 AA524545 Hs.224630 ESTs
319402 W21298 EST cluster (not in UniGene)

310889 AI457946 Hs.170437 ESTs; Weakly similar to hyperpolarization-activated; cyclic nucleotide-gated channel 2 [H.sapiens]
323371 AL135118 EST cluster (not in UniGene)

335568 CH22L.FGENES.58U
320654 AW263086 Hs.118112 ESTs
338983 CH22LDA59H18.GENSCAN3-1
330002 CH.16_p2gi|6623963

315343 AW2G5477 Hs.179891 ESTs
334487 CH22_FGENES.395_9
312169 AI064824 Hs.193385 ESTs
309688 AW204480 Hs.253414 EST
309518 AW148928 HS348895 EST

307965 AI421641 EST singleton (not in UniGene) with exon hit

316787 AW369770 Hs.130351 ESTs
300835 AA401858 Hs.224843 ESTs

338763 CH22_EM:AC005500.GENSCAN.517-16
303327 AA232729 Hs.154302 ESTs
313231 AW139993 Hs.163682 ESTs
331722
301431 
318853 
323032 
317538 
325780 
321739 
319808 
313443 
331366 
316443 
322878 
330320 
329081 
334026 
317791 

331148 



AA195405 Hs.110347



Hs^1062 

Hs.145946 

Hs.185980 



Hs.43149 
Hs.207407 



Z42977 

AW244073 

AW137772 

AL080280

T58960 

AA249037 

AA424754 

AI797592 

AA081820 



AI801500 
ARJ86106 
K73816 



1.7 
1.7 
1.7 
1.7 
1.7 
1.7 
1.7 
1.7 
1.7 



1.69 



1.68 
1.68 
1.68 
1.68 
1.68 
1.68 
1.68 
1.68 
1.68
1.68
1.68 
1.68 
1.68 
1.68 
1.68 
1.68 
1.68 
1.67
1.67 
1.67 
1.67 
1.67 
1.67 
1.67 
1.67 
1.67 
1.67 
1.67 
1.66 
1.66 
1.66 
1.66 
1.66 
1.66 
1.66 
1.66 

1.66 
1.66 
1.66 
1.66 
1.65 
1.65 
1.65 
1.65 
1.65 
1.65 
1.65 
1.65 
1.65
1.65 
1.65 
1.65 
314043
304387 



337272 



334073 

319901 T77138 



AI802877 
AA827082 
AA236027 
AA099732

AA262768 

Z44266 

AW342028 

AW293704 

AW295409 

AI538438 

AA378974

AW074330 

AW402236 

AA354940 

AA885502



315336 
313329 
318086 
313835 



309372 
324157 



302490 



327469 
301918 
315664 
304405 



319250 
310608 
317348 
306513 
320607 
303710 
328291 
304236 
317683 
311960 
312834 



AA476777 
AI744068 
AA282572 
AI341594 
F11623 



AI348076 
AA989230 
AA086110 



W93278 
AI791700 
AW440133 
AI028309



316035 
300492 
316532 
332048 
307113 
319127 
331155 
338220 
315763 
323571 
312240 



CH22J=GENES.327_2B
Hs.8765 RNA helicase-related protein

CH.19_hsgi|5867441
Hs.210843 ESTs; Weakly similar to dJ1039K5.2 [H.sapiens]

EST cluster (not in UniGene)

EST singleton (not in UniGene) with exon hit

EST cluster (not in UniGene)

CH22_FGENES.660-1
Hs.243901 KIAA1067 protein

EST cluster (not in UniGene) 
Hs.256112 ESTs 
Hs.122658 ESTs 
Hs.137945 ESTs 
Hs.159087 ESTs 



300429 AW449679 
305169 AA663131 
316621 A1021S96 



AI744130 

AL031709 

AI307229

AA496019 

AI183686 

N49476 

R87650 

AW515270 

AA984133 

R28626 

AA490934 

AI076101

AI823847 

AA350125 

AW451654 

AA452310 

AI636253

AI620617 



AJ610791 
AI378032 
AA437414 



313179 



317276 
312572 
311932 
302103 
308413 
310077 
337780 
327796 
308352 
324539 



1.65 
1.65 
1.65 
1.65 
1.65 
1.65 
1.65 
1.64 
1.64 
1.64 
1.64 
1.64 
1.64 
1.64 



Hs.130720 ESTs; Weakly similar to CELLULAR NUCLEIC ACID BINDING PROTEIN [H.sapiens] 1.64
EST singleton (not in UniGene) with exon hit
EST cluster (not in UniGene)
Hs.145958 ESTs
Hs.187032 ESTs

CH22.FGENES.301J
CH.02_hsgi|5867772
EST cluster (not in UniGene) with exon hit
Hs.160712 ESTs

EST singleton (not in UniGene) with exon hit
Hs.157522 ESTs; Moderately similar to env protein [H.sapiens]

EST cluster (not in UniGene)
Hs.196102 ESTs
Hs.831

EST singleton (not in UniGene) with exon hit
Hs.188536 Homo sapiens clone 24838 mRNA sequence
Hs^50852 ESTs; Highly similar to ubiquitin hydrolyzing enzyme I [H.sapiens]

CH.07_hsgi|5868383

EST singleton (not in UniGene) with exon hit
Hs.127893 ESTs
Hs.189690 ESTs
Hs.114246 ESTs

CH.11_hsgi|5866875
Hs.169813 ESTs

CH.02_hsgi|6381882
Hs.156739 ESTs; Highly similar to XG GLYCOPROTEIN PRECURSOR [H.sapiens]

EST singleton (not in UniGene) with exon hit
Hs.122138 ESTs

CH.14_p2gi|6272129
Hs.131201 ESTs

multiple UniGene matches
Hs.184304 ESTs
Hs.201591 ESTs

EST singleton (not in UniGene) with exon hit
EST cluster (not in UniGene)



1.63 
1.63 
1.63 
1.63 
1.63 
1.63 
1.63 
1.63 
1.63 
1.63 
1.63 
1.63 

1.63 
1.63 
1.63 
1.63 
1.63 
1.63 
1.63 
1.62 
1.62 
1.62 
1.62 
1.62 
1.62 
1.62 
1.62 
1.62 
1.62 
1.62 
1.62 
1.62 
1.62 
1.62 



337884 



Hs£3439 ESTs; Weakly similar to !!!! ALU SUBFAMILY J WARNING ENTRY !!!! [H.sapiens] 1.61

CH22_EM:AC005500.GENSCAN.246-9 1.61

Hs.118342 ESTs 1.61

Hs.153260 c-Cbl-interacting protein 1.61

Hs.203669 ESTs 1.61

EST singleton (not in UniGene) with exon hit 1.61

Hs.131704 ESTs 1.61

CH.20_hsgi|8552462 1.61

Hs.129986 ESTs 1.61

Hs.187499 ESTs 1.61

Hs.257482 ESTs 1.61

Hs£6090 ESTs; Weakly similar to T20B12.1 [C.elegans] 1.61

Hs.196511 EST 1.61

Hs.148565 ESTs 1.61

CH22_EM:AC000097.GENSCAN.121-2 1.61

CH.05_hsgi|5867882 1.61

EST singleton (not in UniGene) with exon hit 1.61

Hs.125892 ESTs 1.61

EST cluster (not in UniGene) with exon hit 1.61

CH22_EM:AC005500.GENSCAN.54-2 1.61
303620 AA397546 Hs.119151 ESTs 1.61

303481 AA336839 EST cluster (not in UniGene) with exon hit 1.61

314481 AA548589 Hs.105846 ESTs 1.61

300327 AI908894 Hs.245893 ESTs 1.6

323473 AA262442 EST cluster (not in UniGene) 1.6

326154 CH.17_hsgi|5867170 1.6

331920 AA446885 Hs.99087 ESTs; Moderately similar to ZINC FINGER PROTEIN 141 [H.sapiens] 1.6

323827 AW406878 EST cluster (not in UniGene) 1.6

322452 W56710 EST cluster (not in UniGene) 1.6

310597 AI739071 Hs.158515 ESTs 1.6

307671 AI368665 EST singleton (not in UniGene) with exon hit 1.6

322215 AF088005 EST cluster (not in UniGene) 1.6

318420 AI139857 Hs.143837 ESTs 1.6

332217 H98987 Hs.102383 EST 1.6

324937 M79230 Hs.192398 ESTs 1.6

320543 AF052176 Hs.158529 Homo sapiens clone 24457 mRNA sequence 1.6

300674 AW467388 EST cluster (not in UniGene) with exon hit 1.6

315193 AI241331 Hs.131765 ESTs 1.6

319713 R24204 EST cluster (not in UniGene) 1.6

301210 AI379982 Hs.158944 ESTs 1.6

309365 AW072861 EST singleton (not in UniGene) with exon hit 1.6

321403 AW451454 Hs^47568 adenylate kinase 3 1.6

321908 AA376936 Hs.20998 ESTs 1.6

303349 AA382661 EST cluster (not in UniGene) with exon hit 1.6

324338 AL138357 Hs£47514 ESTs 1.6

310599 AW300144 EST cluster (not in UniGene) 1.6

333193 CH22_FGENES.98_5 1.6

336433 CH22_FGENES.825_2 1.6

312097 AI352096 Hs.157169 ESTs 1.6

311445 AW204237 Hs.192703 ESTs; Weakly similar to !!!! ALU SUBFAMILY J WARNING ENTRY !!!! [H.sapiens] 1.59

317736 AI361722 Hs.192410 ESTs 1.59

308147 AI458991 EST singleton (not in UniGene) with exon hit 1.59

313489 AA017492 Hs.135655 ESTs 1.59

316289 AA902488 Hs.122952 ESTs 1.59

326983 CH.21_hsgi|5867657 1.59

314781 AW205298 Hs.202372 ESTs 1.59

328397 CH.07_hsgi|5868397 1.59

331970 AA461084 Hs.187677 ESTs 1.59

321744 N91419 Hs.12028 ESTs 1.59

310509 AI292181 Hs.150036 ESTs 1.59

315921 AI147545 Hs.114172 ESTs 1.59

322049 AI928242 Hs.144383 ESTs 1.59

301161 AA731518 EST cluster (not in UniGene) with exon hit 1.59

300548 AI026836 Hs.114689 ESTs 1.59

319142 F07368 EST cluster (not in UniGene) 1.59

313526 AW152263 Hs.249243 ESTs 1.59

305937 AA883238 EST singleton (not in UniGene) with exon hit 1.58

330123 CH.19_p2gi|6671869 1.58

327819 CH.05_hsgi|5867968 1.58

318250 AI478814 Hs.134603 ESTs 1.58

306760 AI034094 Hs.169476 tubulin; alpha; ubiquitous 1.58

322358 AA220235 Hs.246836 ESTs 1.58

317866 AI690269 Hs.201345 ESTs 1.58

320725 AA703319 Hs.120967 ESTs 1.58

311332 AW292247 Hs.255052 ESTs 1.58

334893 CH22_FGENES.452_7 1.58

318730 AA398215 EST cluster (not in UniGene) 1.58

315889 AW271639 Hs.221744 ESTs 1.58

303702 AW500748 Hs.224961 ESTs; Weakly similar to 73 kDA subunit of cleavage and polyadenylation specificity factor [H.sapiens] 1.57

315086 AI492660 Hs.170935 ESTs 1.57

332514 AA156499 Hs.8454 protein kinase; cAMP-dependent; regulatory; type II; alpha 1.57

335549 CH22J r GENES576J0 1.57

329532 CH.10_p2gi|3983505 1.57

323140 AA180467 EST cluster (not in UniGene) 1.57

313166 AI801098 Hs.151500 ESTs 1.57

337896 CH22_EM:AC005500.GENSCAN.56-3 1.57

330658 AA319514 Hs.211093 ESTs 1.57

324585 AI823969 Hs.132678 ESTs 1.57
317151


AW298195 


308818 


AI819700 


326547 




318833 


H06234 


320488 


R31386 


306929 
338083 


AI124514 


316868 
310937 


AI660898
AI472880


328638 
310074 


AI651039 


327058 




320076 


AI653733 


322345 


AF086529 


314731 


AJ74549B 


318687 


H49619 


303841 


AI934464 


302370 


AJ009849 


322571 


AF156271 


318050 


AI052093


303388 


AL039604 


323758 


AA833858 


328369 
329415 




303915 
336794 


AW468839 


303074 


AA243481 


318807 


F08434 


334287 




311928 


AW024798 


304592 


AA505833 


300785 


AA682913 


304921 


AA603092 


324605 


AW502851 


324473 


AW501163 


300566 


H86709 


314165 


AA761265 


302868 


M157392 


314034 


AI299137 


325389 




331849 


AA417078 


320536 


AA331732 


303347 


AA258033 


315769 


AA744875 


317031 


AA973297 


300203 


AI827065 


304037 


T26438 


322613 


AW160507


317987 


AW138174 


322313 


AF086386 


323392 


AW411383 


325303 




312701 


AI457663 


304787 


AA582678 


305849 


AA861571 


314557 


AM01367 


316507 


AI381515 


315023 


AA533505 


314920 


AA513406 


323097 


Z44354 


325043 


W27919 


307892 


AI376086 


324573 


AA491600 


313092 


A1923673 


324696 


AA641092 


303019 


AFD98363 


317158 


AI459140 


309536 


AW151933 


301568 


AI146423 



Hs.255735 ESTs 
Hs.208231 EST 

CH.19_hsg?5867307 
H&24888 ESTs 

EST duster (not In UniGene) 

EST singleton (not In UniGene) with exon hit 

CH22L.EMAC005500.GENSCAN.174-1 
Hs.195602 ESTs 
Hs.170480 ESTs 

CH.O7_hsgij6004473 
Hs.148559 ESTs 

CH-21Jtsgi[6531965 
H&204079 ESTs 

EST cluster (not in UniGene) 
H&204579 ESTS 
Hs.127301 ESTs 

EST duster (not in UniGene) wSh exon hit 
Hs. 199297 Homo sapiens GNAS1 gene encoding NESP55 

EST duster (not in UniGene) 
H&133132 ESTs 

EST duster (not in UniGene) with exon hft 

EST duster (not in UniGene) 

CH.07 hsgi|5868388 

CH.YJisgi|5B68874 
KsJ257767 EST 

CH22 EM:AO0Q5500.GENSCAN528-1 
Hs.127320 ESTsfWeaxty similar to K1AA0346 [Haptens] 

EST duster (not in UniGene) 

CH22_FGENES.369_17 
H&233374 ESTs 
Hs.162017 EST 

H&247179 ESTs;WeaWysimBaTtoKlAA0319IH^iens] 

EST singleton (not in UniGene) with exon hit 
HS249978 ESTs 

EST duster (not in UniGene) 
H&21371 son of seventess (DrosophBa) homoiog 1 
H&221281 ESTs 

EST duster (not in UniGene) with exon hit 
Hs.154214 ESTs 

Ch\1*Lhsgi|5866921 
Hs.193767 ESTs 
Hs.137224 ESTs 

EST duster (not in UniGene) with exon hit 
Hs.189413 ESTs 
Hs.126101 ESTs 
H&224877 ESTs 

EST singleton (not in UniGene) with exon rift 

EST duster (not in UniGene) 
Hs.130651 ESTs 

EST cluster (not in UniGene) 
H&169688 ESTs 

CH.11_hsgjl58S69D8 
Hs.128127 ESTs 

EST singleton (not in UniGene) with exon hit 

EST singleton (not in UniGene) with exon hit 
Hs.128647 ESTs 
Hs.158381 ESTs 
Hs.1 85844 ESTs 
Hs.152307 ESTs 

Hs.180950 guanine nucleotide binding protein (G protein); q polypeptide 
Hs.32944 Inositol poiyphosphat&*4<phosphatase; type I; 107kD 
Hs.158799 EST 
Hs.161942 ESTs 
H&212827 ESTs 
H&257339 ESTs 

EST duster (not in UniGene) with exon hit 
Hs.129109 ESTs 

EST singleton (not in UniGene) with exon hit 
Hs.146709 ESTs 



157 
157 
157 
157 
157 
157 
157 
157 
157 
157 
156 
156 
156 
156 
156 
156 
156 
156 
156 
156 
156 
156 
156 
156 
156 
156 
156 
156 
156 
155 
155 
155 
155 
155 
155 
155 
155 
155 
155 
155 
155 
155 
155 
155 
155 
155 
155 
154 
154 
154 
154 
154 
154 
154 
154 
154 
154 
154 
154 
154 
154 
154 
154 
1.54 
154 
154 
154 
154 
153 
315674 AA651923 Hs.191850 ESTs 153

321861 N79341 EST cluster (not in UniGene) 153

310890 AI184510 Hs.143728 ESTs 153 

330036 CH.17_p2gi|6042048 153 

316907 AA843868 Hs.190567 ESTs 153

312299 AA972712 Hs.174818 ESTs 153

331128 R51361 Hs.23423 ESTs 153 

305177 AA663591 EST singleton (not in UniGene) with exon hit 153 

337685 CH22_EM:AC000097.GENSCAN.77-1 153

335290 CH22LFGENES527J 153 

308896 AI858667 EST singleton (not in UniGene) with exon hit 153

307944 AI418246 EST singleton (not in UniGene) with exon hit 153

300857 AW340374 Hs.121033 neural precursor cell expressed; developmentally down-regulated 1 153

335320 CH22_FGENES.534_7 153

329841 CH.14_p2gi|6672062 153

317916 AI565071 Hs.159983 ESTs 153 

332801 CH22 FGENES56J2 153 

305413 AA724659 EST singleton (not in UniGene) with exon hit 153 

316707 AJ016387 Hs.184406 ESTs 153 

313693 AW469180 Hs.170651 ESTs 153 

316101 AA922236 Hs.221037 ESTs 153

320796 AF038966 Hs.184543 secretory carrier membrane protein 1 153

307451 AI248615 EST singleton (not in UniGene) with exon hit 153

323648 AI679968 Hs.152060 ESTs 153 

331482 N27515 Hs.40296 ESTs 153 

318059 AI023175 Hs.167022 ESTs 153

325958 CH.16_hsgi|5867142 153

315736 AA664265 Hs.230213 ESTs 153

314740 AW015667 Hs.119427 ESTs 152

314117 AA224368 Hs.185164 ESTs 152 

301646 AA313954 EST cluster (not in UniGene) with exon hit 152 

338752 CH22_EM:AC005500.GENSCAN.513-10 152

309314 AW009312 EST singleton (not in UniGene) with exon hit 152 

301445 AI208364 Hs.128233 ESTs; Weakly similar to REGULATOR OF CHROMOSOME CONDENSATION [H.sapiens] 152

308501 AI685263 Hs*01150 EST 152

312330 AA635305 Hs.121574 ESTs 152

318040 AI018150 Hs.148781 ESTs 152 

336205 CH22_FGENES.719J0 152 

325701 CH.14_hsgi|5867028 152 

315009 AW189460 Hs.208358 ESTs 152
303121 AW407585 Hs£7769 ESTs; Weakly similar to mCAC [M.musculus] 152
309271 AI986221 EST singleton (not in UniGene) with exon hit 152
328385 CH.07_hsgi|5868395 152
307700 AI318545 EST singleton (not in UniGene) with exon hit 152
314591 AW103292 Hs.245328 ESTs 152
304484 AA432067 Hs.258373 ESTs 152
304382 AA232873 EST singleton (not in UniGene) with exon hit 152
304232 W52574 EST singleton (not in UniGene) with exon hit 152
309853 AW298169 Hs.57553 tousled-like kinase 2 152
312504 AW207346 Hs.143202 ESTs 152
313134 N63406 Hs.258697 ESTs 152
330391 AF015950 Hs.115256 telomerase reverse transcriptase 152
314342 AI873046 Hs.258775 ESTs 151
305977 AA887293 EST singleton (not in UniGene) with exon hit 151
301165 N85789 Hs.224155 ESTs; Weakly similar to PTERIN-4-ALPHA-CARBINOLAMINE DEHYDRATASE [H.sapiens] 151

300613 AI932294 Hs.249604 ESTs; Weakly similar to B-CELL LYMPHOMA 6 PROTEIN [H.sapiens] 151

324124 AI554212 Hs.185664 ESTs; Weakly similar to SERINE/THREONINE-PROTEIN KINASE NRK2 [H.sapiens] 151 

308037 AI458207 Hs.174181 ESTs 151 

323909 AL043148 Hs.186257 ESTs 151

315464 AW139500 Hs.116135 ESTs 151 

306700 AI022056 EST singleton (not in UniGene) with exon hit 151 

337976 CH22_EM:AC005500.GENSCAN.107-1 151

306855 AI083982 EST singleton (not in UniGene) with exon hit 151

311045 AI569399 Hs.174746 ESTs 151

315010 AA531082 Hs.240049 ESTs 151
310205 AW025248 Hs.202445 ESTs 151
310759 AW135924 Hs.224883 ESTs 151
TABLE 14A shows the accession numbers for those primekeys lacking unigeneID's for
Table 14. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from
Genbank ESTs and mRNAs. These sequences were clustered based on sequence similarity
using Clustering and Alignment Tools (DoubleTwist, Oakland, California). The Genbank
accession numbers for sequences comprising each cluster are listed in the "Accession"
column.



Pkey: Unique Eos probeset identifier number

CAT number: Gene cluster number

Accession: Genbank accession numbers
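
The three fields above can be read programmatically. The following is a minimal sketch only, assuming each complete row is whitespace-delimited as Pkey, CAT number, then one or more accession numbers; the names `ClusterRow`, `parse_cluster_row` and `index_by_accession` are hypothetical and are not part of the original disclosure. The two sample rows are taken from Table 14A.

```python
from typing import Dict, List, NamedTuple


class ClusterRow(NamedTuple):
    """One table row: probeset key, gene cluster number, member accessions."""
    pkey: str
    cat_number: str
    accessions: List[str]


def parse_cluster_row(line: str) -> ClusterRow:
    """Split a whitespace-delimited row into Pkey, CAT number and accessions."""
    fields = line.split()
    if len(fields) < 3:
        raise ValueError(
            f"expected Pkey, CAT number and at least one accession: {line!r}"
        )
    return ClusterRow(pkey=fields[0], cat_number=fields[1], accessions=fields[2:])


def index_by_accession(rows: List[ClusterRow]) -> Dict[str, List[str]]:
    """Map each accession number to the Pkeys whose cluster contains it."""
    index: Dict[str, List[str]] = {}
    for row in rows:
        for accession in row.accessions:
            index.setdefault(accession, []).append(row.pkey)
    return index


# Sample rows copied from Table 14A (assumed to be intact rows).
rows = [
    parse_cluster_row("316141 423880_2 AW303457 AA972713 AA724265"),
    parse_cluster_row("303714 1155758.1 AW501336 AW501337"),
]
lookup = index_by_accession(rows)
print(lookup["AW501336"])  # ['303714']
```

Keeping the CAT number as a string rather than a number preserves suffixes such as `_2` and `.1`, whose meaning is not defined in this section.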



Pkey CAT number Accession 



322064 234514.1 
321409 197898.1 

322092 4667BJ 
321452 212379_2 
313603 199797 J 
320856 36098 J 



322139 
321500 
313733 
322215 
322235 
321632 
313833 
322310 
322313 



46806J 



322370 
321739 
321781 
314570 
300129 
322452 
321861 
323140 



321914 
322571 
322574 
314753 
300370 



441212.1 

47002.1 

47O70.1 

286374J 

120893.1 

47376J 

47386.1 

47434.1 

47467.1 

47537.1 

47545.1 

187612J 

43998.1 

1511778.1 

280469.1 

635249.1 

497108.2 

1651920.1 

159551.1 

38916.1 

85114.1 

22297.1 

39412.1 

311451.1 

3910J 



322601 577912.1 
322613 34330.1 



316055 409389.1 
323316 981458.1 
300492 25768.1 



BE261397 Z78343 BE176419 AA383657 N90640 AA334052 AW955761 BE536232 AA374087 AA584776 

N71838 AA282003 T54072 AA761419 H92966 AI631371 A1095435 AI690247 R99331 AW964110 AA975590 AA346128 

H94196C03864 

AF085833 R69689 AW341677 AA923375 BE327566 AW630415 R69601 AW615339 
AW962489 H64300 AA329527 
AA284333 AW468119 AA284334 AA810992 

AB040928 T94673 A1289313 A1536039 Z44366 BE141499 D60116 D61488 059945 AM 19503 R28090 R72986 H03255 
A11891 12 AI912312 AW51 1018 AI401349 AW470144 C14624 AI335797 Z40300 AI014456 D60269 D60115 T16722 AI370673 
D60270 

H53744AF075088 H53797 
BE004271 AI248023 AI022157 K71999 
AA766346 AA809877 AA6361 16 AW469598 AW977404 
AFD88005 N51816 N51731 

AF086106 AM 93589 AW665594 N71795 AA722627 AW665373 AI300251 

AW812795AM19617H87827AW299775AVV382168AW382133BE171659 AW392392 BE171641 AA541393 

AA766825 AA81 1 180 AA085906 AI762946 AW977820 

AF086376 W77804 W72689 AAB37735 

AF086386W77947W7Z708 

ATO86431 AA886756 A1557237 

AF086467W81444 W81445 

W95208 AF086529 AI912190 AW294159 AW58747 W94782 

AF086538 W95969 A1631911 W95835 

AA330095W25112AA248401 

AL080280 T73124 KQ2689 AL08Q281 

D78667D78871 C18258 

AA904776 AA405696 AA405S62 

AW028820AI219068 

AI147202W56755 W56710 

N79341 N99082 N47551 

AA1 80467 AA449184 AA464831 AA505048 

T55958T57205AF147346 

AA011603N5B504N58611 

NM-016102 AF156271 AA7B186B AW152318 AW77O403 AA909463 AA482996 AA758672 
AF156543 AA639797 A1675267 AJ825497 AI823355 
AA463262 AA463615 AW 160405 AW407583 

AW136181 AA581939AK001221 AA694538 AA424043 Al 01 6272 AA098960 AA884473 AI356180 BE391633 AA437086 
AI277866 AA098827 AA992680 BE172624 AA424101 AA320776 AW962967 N77431 AW858960 AW858897 T85649 
AA357743A1827817AI905672 

A1082395 W92924 BE048524 AW005302 AI084474 A1369330 AI827710 AW135506 AW298694 

AW160507 NM.013367 AF191338 AA384939 AI445790 AA730309 BE397003 BE267753 AI979163 N50386 AW583671 

AW583608 BE074466 BE074479 BE074471 AW976283 AA604393 AW162122 W73648 AI823475 N75898 W73713 

AW470099 AW513236 AW025055 AW6131 15 AI923379 W58081 AW664525 AW196795 AI143619 A1565152 AA025406 

AA505846 AI685494 AA829964 N59156 N59163 R15442 AA826919 A1610221 AI200120 AA603279 AW150822 AI189513 

AJ807122 AI016368 A1335868 AW583389 A! 193892 AI956157 A1628879 AW591589 AW583446 AI955406 AW148396 

A1340255 AI867942 AA748525 AA876991 Z38516 AI8740Q2 A1869474 N63100 AA429094 AA082443 

AW105663 AA693880 AW517398 A1758507 BE220851 AW978538 AA831489 

BE219300 BE327455 AL134620 R36741 R17996 

AL031709 AI249061 AA907658 A1420444 
308362 782618J
307783 697809 1 
301161 427238J 
324094 270098J 

309023 4737J 



316141 423880_2 AW303457 AA972713 AA724265 

323371 117336.2 N45114 N5146S BE087338 AI083551 AL1351 18 BE395609 

307700 30923.11 BE280998 BE254670 BE294951 BE5S4979 AW405364 AA059256 AA1 28837 AI559687 BE281405 AW4 10850 BE041153 

A1254811 AW301340 AI813335 AW301411 A1609469 A1B11607 A1611616 AK377623 A1335509 A1613544 BE043165 AB71663 
AI340452 AI612066 AW072890 A1254558 AI349884 AI370095 A1613383 AI61 1946 Al 61 3353 AI307414 AI318229 A1612685 
AW305327 AW268924 A1370063 A1349292 BE049068 A1369098 AW274098 Ai344845 AW075187 A1053401 AI345220 
BE138515 A1613386 A1583302 AW301 955 AJ349681 AI307432 AI054168 AI223913 AI612081 AI348942 AI334539 AI309366 
A1370098 AI252360 AW066316 AW26891 1 AW073482 AI379802 AI224284 Al 0536 61 A1334538 AI309369 A1309688 AI310023 
AI492709 AI335418 AI053999 AI366989 AW073478 AI247058 A1249584 AI305875 AI308585 AW071272 AI271487 AI340719 
AI366995 A223673 AW271066 AI611938 AW071296 A1270798 AE54385 AI251393 AG52562 AW268236 AI254858 
AW071317 AI3091Q2 AI609897 AW268971 AI583267 AI792484 AW075168 BE138443 AI254126 AI309822 AI310872 
AI61 1953 AE251054 AW27665B AI335405 AW075039 AI31 1768 AI612028 AW271895 AI612005 AI312240 AW271082 
AI371642 AI334879 AI310194 A1310772 AI345419 A1334675 A1223914 AI284707 AI284813 AI349140 AI254853 AB13094 
AI310170 AJ309499 AI312476 AI376484 AI335467 AI340802 AI309815 AI310168 AI61 1446 AI345824 BE327775 AJ318545 
F17185AWB14950 
AW998989AI613519 
AI347274AWB44G24 
AA731518AA765714 

BE395109 AW663898 AW237041 AJ492154 BE046906 AB51285 AI983290 AW002590 AC01040 F32424 AA992272 
AW271836 

AF180681 NM.015313 AA229509 AA225792 AA216413 AI888045 BE005205 AB002380 155518 BE276097 AW380669 
BE142836 AW370976 AA479384 R9S425 AI680999 AA595138 H54582 AKJ22709 T§5440 AI041769 AA861 144 AW392Q28 
AA479287 AA824634 A1638446 H54691 R9B382 AA770352 Al 640467 AW293491 AA778138 R28298 AA970562 C15590 
R84455AA020769AL036394 H80566BE548861 AA301207 AW959414 AI284253 AA043173 W52429 BB44571 R24852 
Z42603 F13120 R24340 R24326 T75305 H701 10 N56255 AA334210 F1 1453 AW947285 H80345 AA298992 AW380931 
AI267175 Z45421 AW380981 W861 13 AA663590 AA1 67577 BE566760 BE1691 66 AA449904 AA45920S N31 126 W03564 
N31208 AW993277 N44765 AW605275 D61449 W68572 AA258190 D60496 AW992964 U46277 H04097 AA370360 
AW95721 1 AA159775 AI831243 H83367 H21671 D61077 AW392712 N21 1 12 H98522 N45298 N83629 AI393509 AW022043 
AA744886 AI580482 AA7232B6 AI422244 AI423984 D62804 AI088349 AA587890 AI144172 N33275 BE074397 H03399 
D62578 AI056639 AI82991 8 AA579584 A1089460 AI350124 W68573 AI580828 H98897 AI570468 H8371 5 W8B1 1 4 AA923123 
D57446 AA043174 AW337721 AI266551 All 40017 AW022358 D79855 D79650 D79393 D60495 AA788666 AA693443 
AW516977 W60139 AI628156 AW473223 AI608892 AA159670 AW440366 A1421529 T50751 AI174374 AA912234 AA724248 
AW780400 AA907218 H80514 057452 AAB63419 AA552618 029614 R44556 T16452 R44935 241132 D29188 K69692 
AI250176 A1078860 AA370359 AW183108 H74200 AA258183 F10723 C00323 R86148 AA850570 A W1 30073 AL079946 
AA410327 AA532614 AA234500 AI151507 AMI 0288 AW969839 AA483232 AI383200 AA236540 AI807672 H73441 
AA262442 AA766862 AA262443 

AA827650 AA827652 AW629526 BE044585 AW974451 AA761439 AA648505 AA765803 
AAQ81820AA082191 AA079811 
AA807558 AA8271 17 AW629567 

NM_01 6603 AF251 038 Al 124624 AA776579 AW298470 AI304868 AW082724 AJ348442 BE218336 N20641 AI018013 
AW858832 AW978157 AA815187 AA932948 AF157316 A1444958 W00848 W02935 AI434933 N26335 AA428681 AW371059 
A1651612AW134937AW968911 AA488815 AL157523 W48766 AW936954 AW936941 AW579205 AW936888 AW936889 
N74541 AW936953 AW578421 AW604352 AW367088 AW849258 AW849453 AW371606 AI554921 W49785 H99814 
AA805957 AA904606 AW206696 BE1 69229 AA333951 AA190704 AW936944 AA463219 AA430306 AW805704 N46503 
BE222307 AI638612 BE550045 AI805304 AI690987 AA776841 H12690AW1 83731 AI380760 AI636261 AA812641 
AW592656 AI886132 AA843424 K99220 AW084996 AW126879 A1800871 AA810135 AA191524 AI150076 AI474530 
AA748461 N29013 AA746372 N59606 

N75450 AA877636 AW137945 W05248 AA514763 AW972399 AI758397 AW 195051 
AW402931 BE393099 
AL036947 T93676 T85475 

AA641735 AA281881 AA861209 AA934758 AA835887 AA641795 AA748822 AW295703 
AW467388AA826954 

AF168711 AA099732 BE019157 AI380212 BE298159 AA249097 AA3051 12 AW962349 AW962353 AW401801 BE292961 
AI439469 AA442919 A1630537 AA724473 AI814288 AW966815 AI376871 AI860202 AI683132 AA099733 AW6Z7633 
AI754022 BE206347 AW183349 A1378222 BE1 78926 AI4732B2 W52944 AW752469 AW966817 
AA301270 AA301379 AA301366 

R85652 AA114024 AA296219 AA375304 AW963796 AW885952 AW020969 AA114025 AI804930 BE350971 AI765355 
AW317067 AW974763 H85930 AW172600 A1310231 AW612019 D62908 D62664 AA652738 A1674617 AI494064 AW1S8666 
AI147620 AI147629 AWS1 1793 AI668922 AI971005 A1884742 AA174171 

AK001701 AA134337 AA356202 BE163251 AW875175 AW875181 AW875177 BE163389 AK000741 AA247755 AA120819 
AW868040 AA3091 16 AW962348 AA471267 AW996843 AK001452 BE005344 BE617899 AA186588 AA120820 AW36331 1 
AA648105 N71529 BE168417 AW673900 AI858160 AA1 34338 AA659697 N22162 AI335437 A!31 1237 AI343171 AI336661 
AW268074 AW274348 AA935005 AW576295 AW262626 AW593153 AA730055 AA662650 AA782687 AW894855 AI933533 
AW193002 AW899448 AW890142 AW812670 AA085664 AA334191 BE178085 BE180553 AA389680 AA984772 AA442527 
W26560 BE384359 AA847210 AW304931 AI669606 AA085613 AW197240 AI632828 AA581646 AW129348 AI017643 
AW089030 D20893 AI382955 AI557148 AW499979 
324231 975669 J W60827 AL079968 AL047234 
324248 977901 J AW504918 N55410AL118584AW839266 

323691 221757.1 AA317561 A1793000 AW2351 1 1 AJ793178 AA767397 A12631 13 AA719462 



323473 193878J 
315639 392767.1 
117013J 
.1 



301256 16720.1






300611 337193.1 
324157 247225.2 
323509 987739.1 
323514 197787.1 
300674 466093.1 
.1 



323591 209807.1 
10774.1 



322957 29014J
315858 406384 1
301431 569736 1 
324303 233842J 
324330 300543 1 
300815 41537J 
324349 1154015 1 
323715 225129 1 
309314 23273_-3 
323758 229624.1 
309375 127J 



325031 266373.2 
325045 1534945J 
324473 38795 J 
323827 235506 .1 
302270 17341 92 J 
301618 10967.5 
301646 42154J 






316774 463723J 
309577 6483_6 
302345 29533 J

302358 1064753.1 
3246H 215437.1 
324661 385257J 
324685 41003.1



324692 351987.1 
316893 473541.1 
303027 21786.1 

324715 290035_2 

324771 385085.1 
324783 389615.1 
303114 37417.1 
303124 2H12J 



302552 82290.1 
301918 316229.1 
303232 20474.1 



302696 33570.1 

302697 43219.1 
309917 57485.2 
303347 192210.1 
303349 193138.1 
310599 690880.1 



AA737345 AA682286 AI799378 

R05385AI061251 

AL1 18754 AA333202 H38001 

AA884766 AW974271 AA592975 AA447312 

BE152396 BE152395 AA267515 BE001834 AA286678 AW406477 

AW501470 AW502931 AW499500 

AA322155 AA326396 AA326538 

AW009312 

AA833858 AW978090 AA327679 AA810436 

AF286598 AW075342 AB028994AL043713AW378914AA340650N57166 AW956914 R17961 AA336481 BE393734 
AW977867 AW294638 AA927857 AA961627 AW303969 AW89441 6 AA8121 19 AA912758 AA424355 AA490582 W30941 
AM76693 AA131029 AA127777 AL043714 AA496984 T51 1 17 AA127722 AA594012 AW92676 N76483 AW1 19061 BE464926 
AW30341 9 AJ972370 AI768172 AI826550 AI435432 AI379516 AA778421 A1276089 AA424521 N59361 AA723153 AA723176 
A1867487 AA090677 A1827221 AI351 027 WQ2732 AI810729 AA142848 AI0821 10 N59379 N29744 AI283747 AI148665 
AW779845 AI382967 F34319 AI369934 AI282438 AW183449 AA863467 AA813469 AI092645 AI870701 AA863119 
T65475 R07576 T17017 F08143 Z43546 
T08845 Z43538 F06691 

BE560824BE513941AW2389OTAA580852AV^1176BE241846AV^ 
AW406878 AW966560AW966151 AW966496 AA336174 AA335376AA335537 
R56151W91936 
T52761 T52760 

AJ277841 AI630669 AI804370Z41939 AW751251 AA299456 Z44739 AW860471 Zg>158AW1 05391 H56997 W84688 
AM91201 W84636 AA706815 AI131055 AA483636 AI005O75 AW340034 AI332372 AW1 18195 AI338932 AI191968 
AA693932 AI189982 A1 193225 AA884163 AA594562 W37747 AA249754 AA746131 AI91 6540 A1832188 AW946555 
AA833838 Z40564 AA861563 F01447 AAB87937 AI933559 AW973250 AA566018 AA313954 
AA354146 AI184230 AA643525 
AA492588 AA492498 AA492571 
AA814859 AA8 14857 A15B2623 
AW902251 AW168753 

X12830 NM.000565 AW503691 X58298 S72848 AA193347 AW503481 AW177946 AW178192 AW1781 88 AA265233 
AA410577 AA193465 AW177939 AW365459 BE221693 
AW207734 D60164 D81150 D81078 061356 AW996804 
AW503101 AA3091B4 N56323R7Q998 
AW504161 AW503601 AW505509 

AF226667 AA207032 AA100804 AA121287 AA488316 AI808218 AW419048 A»1 1097 AW132123 AA502311 AW089948 
AA100952 AI075431 AW083432 AI990554 BE466029 F28643 AF086422 W79581 AW439007 F37179 W79780 AW439035 
AA731381 AW750380 AA25101 2 AW589846 AA730238 AA329792 AW087255 AA220982 AA082469 AA877260 AA232380 
BE298910 

AA557952 AA677593 AA618150 
AW979189 AA837332 AA856946 AA876935 

AF111178 NM_005708 AF105267 AW590040 A1979280 AA001322 BE146329 AA702430 AA7Q2429 AA694221 A1206348 
AI206285 AW770197 AA923032 AI379586 AA701 165 AW594643 AA001909 AWD02368 

A1739168 AA426249 Al 19 9636 AW505198 AW977291 AA824583 AA883419 AA724079 A! 01 5524 AI377728 AW293682 
A1928140 AA731438 A1092404 A1085630 AA731340 
AA631739 AA768584 AW134477 
AA640770 AI683112 AA913009 
AF090948 AI064898 AI1 1 1 182 

AB018257 BE148S40 AA081832 AK001915 AF150217 AF161350 AS219174 AW074664 D60040 AA346065 H28750 
AW151783 BE613360 BE612626 BE5G2031 AW183790 AA992580 AA505815 AI310432 AI678015 AW592679 AA979181 
AA806708 AI74411 0 H24681 CI 6064 D62900 A1285033 AA346064 AJ865123 AW467798 BE221231 AL120676 N89877 
AI928370 AI35B387 AA748486 AV647478 AV647460 AA312313 AI279340 AW505099 
AA005122 H49792 
AA476777T86049 

AA437414 AA131479 AA086182 AB037775 AW161063 AW514393 AA332331 AW136197 BE1 50789 AA425533 AA249605 
N88308 AI016201 BE004662 AA291027 R57567 AA424277 AA476391 W07532 T97036 AA218898 AW162629 R57770 
W01278 W902O4 W90156 AL1 19197 R84513 AA280103 AA334994 AW965504 AA460868 AA447470 AW1 38594 W38898 
W90028 A1078353 W9007B AA699696 N35523 AA704225 AA035059 AW134892 AA1 15140 Al 142854 H90084 AA826342 
AA460694 N46339 AA425344 N56953 AA035569 AI761083 A1658696 AI524818 AI338965 AW069249 AW299871 BE464061 
AJ 189720 AW340682 AJ423380 AI275122 H17532 N80735 AA826343 Al 039 694 BE328398 All 82947 AW271286 A1623122 
AI922802 AW293087 N22141 AA730657 AW316610 N26473 F06663 Z43610 H14783 R59761 H1 1540 AI265915 AI681773 
A1091748 BE220636 AW841861 AI702181 AM68447 AA907544 AI273941 AW244034 R37769 AA446663 T96929 BE045884 
AA476341 H89994 H29043 AW05121 1 N49522 AA306977 

AK000738 AA347452 AW961713 H70832 AI750643 AA362887 AW9555B8 W44974 AA279599 AW298762 AA452666 

AA443355 AI337273 AA446931 AI752977AA661554 W42674 AI292172 R41163 AA621381 AI244157 

AJ001409AJ001410 

AW34O014 AW866993 AV651649 

AA258033AA459485 

AA382661 AW958642 AA259088 

AW300144 AI33B491 AI798381 BE220076 
319127 1653640J
303480 232749.1
303481 31534J
303494 238389.1
319142 164820J 

L1 



303388 969232_1 AL039604 AL039497

302761 45074.1 AW250553 L07876 236843 R30693 AI180097 AW965317

318455 606341.1 AI148763 AI903763 AI903753 AI903762 AI903800 AI903801

317850 363835.1 AI681545 A1951714 A1570397 AW87358B AAB36396 AB59986 AM99790 AA773477 AI951615 T07547 AW304709 AF114041 

BE1 76629 Z445B0 T30422 T32690 AW953065 H 10602 
303431 32082.1 NM_000539 AA019013 AA01 9367 AA056 154 H38735 AAQ57003 AA021051 H381 02 AA0 15774 AA059291 AA019439 H84843 

HB3375 AA019914 AA017288 R84449 W26519 H38258 AA018736 H34147 AA0 18577 AA059353 U49742 H38767 AA318341 
AA317553 H86646 K91989 AA317398 AA317378 W29024 W23034 T27877 AW950059 AA017195 R84262 AA057177 
H89941 AA01 9904 H84662 AA015775 AA019368 AA020976 H3790O C20733 H38682 H85197 AA01 8578 AA017252 
AA019440 AA059059 H38651 HB4148 AA018560 W25754 C20752 AA317915 AW9521 15 AA317369 AA01 9845 R85402 
AA019492 AA017196 AA056093 AA056094 AA058836 AA056155 W25957 W23Q27 AA056159 W23043 W21890 W28951 
AA317978W26459AA317285 
N49476Z45911R21061 
AA331906AA332484 

AK001952 AA336839 AW249271 BE247287 AF 182002 BE613472 AW9S2573 AA332235 AW849937 AW849814 H49893 
AM77148 AW968944 AF182003 AW007897 BE246145 W76100AI480141 AW410205AA609339AI209111 AW000979 
AA33Q280 AW961554 W72865 H49894 AA514317 AA620407 AA504522 AW472833 AA716609 AW129282 AA347351 
AA628378 AW589860 AI636696 AA464632 AA464533 AW8741 89 AA757076 AA479654 AW517910 AW292357 AW872638 
AW262283 AI910666 AW5 13749 AVV233771 AA215797 BE387073 

BE143533 AW850432 AK000042 AA333666 AA385314 AW966616 AW793068 AW793414 AA361 103 AW390841 AA040095 
AW385058 AW799162 AI3831 15 AI990745 AI6S3703 BE503893 AW150758 AI949919 AW190450 AW512348 A1625970 
AW501057 N52954 AI281 378 AI40171 0 AI648409 AW002659 AI687639 AI093943 R33960 AA040062 A1926267 AB40425 
A1520911 AI093428R52943 

303488 36085.1 AI040372 AB040915 W40569 BE1 58910 BE1 58914 D63226 AW025860 AW583088 AA334307 AA210942 AW753212 

AW805322 AA3S2635 BE15891 1 AW891225 AW994862 AA805451 R28541 AA229347 N48266 AI377788 R28682 R38122 
AA811941 AI240742 AI632001 T99965 W01976 AW891205 AW891177T97433C15571 AA346850 AA504293 W07500 
AI694503 AA489216 AA327725 AW959917 AA694146 N68514 A1076285 AW016246 T077B3 AA6424Q0 AA716133 AAB05332 
R00312 AA705021 AW498605 AW891723 AW891 906 AAB08025 N29039 N74897W60393AA810184A1627460AW057516 
AA807436 AA760966 AI359295 N78642 N20662 AA830300 W81705 AA832258 AW891718 AI811796 AW515523 Z41735 
AA449978 AW891714 AI684539 AW891896 AW071701 AI890916 A1924994 AI039743 AA888524 AA244214 AI015736 
AI270105AI865077 

F30712 F35665 AW263888 AI904014 AI904018 AA336927 AA336502 
H08370 Z46168 F07366 AA193168 AA193138 

AK000290 AI476034 AA465309 BE148761 AW303607 AW958665 AW469635 Al 8 19365 A1243857 AW469326 AA1571 10 
AA278626 AA496257 AA306656 F29732 AA831 859 AA312210 AA564476 AA579065 AA769522 AA740386 AI205635 
AA491643 AA81O400 AA417708 AI567332 AA157392 N53817 AA374229 
R68545 T271 19 R25687 AW750672 
H13364T2713S R61679 AA746905 
H77679 

AB038995 NM.016530 AK001 1 11 AA465635 AW968716 U66624 AA885459 AA703019 A1040266 AJ 01 8689 AI692886 
A1125372 AI376796 AI192040 N58161 AL1 33607 AW503673 AW505479 AA362265 AJ404671 
F11623H17552AA347728 

BE311816 AK00091 6 AW868037 AW868039 AF228527 AI752482 AW86B041 AA077049 AI201537 W55S73 AA20601 9 
AA077918 AW968729 AI978828 AW139620 A1093053 AW204025 A [41 8805 AA598926 AA586345 AA045669 BE314455 
AA045668 

W01 166 AW996900 BE184300 Z44887 T34535 R51495 AW886575 AA295490 AA295162 AA295163 AW937125 156951 
BE386106W52674 

AW5001 06 BE241915 AW503971 NM.016542 AB040057 AA313812 AK000556 W16504 A1822088 AA259107 AA1 9 131 9 
BE085957 AA309584 BE122687 AW952435 T84469 BE088194 BE088132 AA328562 BE092674 AA263102 T3S634 
AW992380 R79391 R24392 H03060 AW675066 AI299952 AW020325 D25953 N75199 AA361425 AW612302 AW236333 
AW673897 AW953686 N22323 AA649166 A1377099 H03Q61 AI660072 AW276405 AA809779 A1803430 AW297484 
AW510384 AA814816 AA371522 D83035 AA953567 R79392 R24282 AA876831 AW297542 AI699023 AA992652 AIQ41 436 
318704 799152.1 AI631602 AW589676 Z28684 Z24981
318730 275116_1 Z32887 BE349923 AA398215 AA399231
303714 1155758.1 AW501336 AW501337 
304387 183612.1 AA236027 BE003275 

304398 10169.1 AA195509 BE394661 AV660757 AA489161 BE165972 AW503705 AA262785 AF123320 Z78357 NM.014171 AF161488 

AA248971 BE568575 AA461410 AA165108 AI637731 H75454 AA372934 AW339334 BE568754 BE564697 BB67299 
AI681606 BE537269 AW197204 AA290890 A! 169393 AW292463 AW470227 F27399 AW61 1942 BE566888 AW301701 
AI675761 A1628429 AA164711 AI797753AI 656879 AI91 2690 AI675277 AI695099 AI094095 AW01 4158 BE091059 AI201748 
AW236961 AI038003 AI083606 AA401 606 AI079405 AI073516 AI655537 AA401475 AI814532 AI079862 AI093789 AI422084 
AI216476 AI392760 AA926998 AA781782 Z25198 A1086377 A1 185511 Al 185539 Z28843 A1223792 A1379563 AA706253 
AI433798 AI921885 H75455 AW025269 AI224100 A1083611 A1225057 AW1 96334 AI572254 AA761 623 AI472801 AA283784 

303751 468554_1 AA830149 AW978407 M85983 AW503637

319401 1323199_1 W00973 N56457 AW992226 T84921 R01342

319402 1003489.1 R86913 R86901 H25352 R01370 H43764 AW044451 W21298
318807 1536467.1 F08434 Z42573 H28810

319478 765461.1 AI524124 R06841 R06842 
318872 1534581.1 Z43108 F06295 R13085 




318518 1205335 J 

318519 434741.1 
304168 72494.-10 

302948 21445.1



319250 244351 1 
318644 17700.1 



.1 

304232 20640 J 
318885 94880.2

303841 79133J 

303889 17771 83_1 

319539 63198J 

318905 1536403.1 

320187 396254.1 

318996 65715.1 

319635 163534.1 

319699 747196J 

319713 1699356.1 

319761 75324.2 

319764 88596.1 

319808 7069.3 

321040 193331J 

320409 43709.1 



319881 1585983.1 

320488 368456 J 

321121 1545647.1 

321205 81249.1 

321253 375160.1 

314043 155125.1 

320630 17685.2 

313435 443527.1 

313443 82292.1 

313472 82811.1 

321348 41762.1 

314138 179960.1 

320712 57156.2 

321383 41924.1 

312996 187327.1 

306513 

306537 

306557 

306598 

306620 

306700 

308078 

306813 

306830 

306855 

329722 C14J>2 

32972B C14JJ2 

306890 

308100 

308147 

306929 

308352 

308383 

308521 

308561 

308617 

308771 

308828 

308896 

303019 41850.1 

303084 4421 1J 

305092 AA642912 

305169 

305177 

305235 

305413 



AA742999 Z43272 AA345258 AW955577 AA031942 

W19657 BE616760 BE259848 BE382680 BE615587 AI934464 AA322745 T07155 AW961174 AA307302 Z41888 AA621992 
AA188400 AW770608 AI147458 AI148408 A1696291 AA972591 
T19204 T36109 T36107 

R09Q27 AA344892 AA329574 AW955648 AW978708 AI567804 A1378935 AW014657 AI804134 R08922 N92947 BE546788 

F08365 Z43395 R54298 

T99949 AA654769 AA664550 AW975264 

Z44266 H06384AV655948 

R 17531 AW960899 AA338366 AW673294 BE047729 BE047722 AA330746 AW841797 H05O30 Al 142105 R12654 

AI458682 H24240 R14537 R18426 AW867082 

R24204R15712TB4695 

AW630974 BE005208 F©4237 AA724997 AA334867 AW955777 R1 8816 

AA019827R18947K46852 

T58960 AA6091 80 AA621 130 Al 927236 AA431075 

AA261830 AW967855 H26953 AA262478 

AA226869 AA296516 AW959753 AA186390 AL359619 AA356195 AA148427 R22748 AI033624 BE548853 H95327 
AW579751 BE561649 AA397533 BE617136 AA236444 T89946 AA247450 N55777 W38725 AI743846 AI808406 AA922229 
AI051464 W04713 R1 1251 W19656 A1042319 AA489276 AI224533 H95274 AW269958 T8931 1 AI890088 AI862754 
A1830968 AI669336 A1589780 AA534557 AW273839 AI338155 AI126632 N83542 BE046048 AA807028 AA848107 
AW167S78 AA976930 AA148428 A1289304 AB24262 A1625961 AA773469 AJ222288 A1280054 A1242371 AA227222 
AA973329 AA296517AA829436AA234526 AJ149769 AI567665 AA936939A1590681 AW469308AI689531 AA486419 
AI422051 AI057252 AA626941 AI475352 AW247913 A1222370 AA670122 A W1 98034 AA486418 AI363794 AA380739 
H51299 H44619 H46391 R86024 H51892 T72744 
AI817336 R32883 AA595590 AI743065 R31386 
W23285 H42714 F25381 F37215 
AA002047 N72537 H54142 H 81 580 
AA610649AI699484 H59558 
AA827082 AA732246 AA 167611 AA830741 

AA199847 AA410224 R53323 AW936567 AW936569 AW936568 AW936571 

AA769123 AA831715 AW977666 W92553 

AA005125 W95019 W93335 AA249037 

AA007374 AA007466 AI816886 

Z49979 D61703U3016B 

AA740616 AA654854 AA229923 

R66867 R65678 R82673 W73128 R83101 

AW968556 AJ238555 AW968731 AJ002574 AA459446 H70260 AW977557 AA767351 AW268572 AA810719 AI698677 

A1300460 AA907450 AA649224 T07415 AI536896 BE018515 AI279865 BE047421 

AW368634 AI702169 AI245179 AW368646 BB54S574 AA249018 AW368633 N27553 

AA989230 

AA991705 

AA994530 

A1000320 

AI000929 

AIQ22056 

A1472621 

AI066544 

AIG75803 

AI083982 



AJ092235 
A1475949 
AI498991 
AI124514 
AI610791 
AI624497 
AI689808 
AI701559 
AI738720 
AI809301 
AI824829 
AI858667 

AF098363AF098365 
AF174008AF174027AF174106 

AA663131 
AA663591 
AA67O460 
AA724659 
305849


AA861571 


305854 


AA862733 


307113 


AI183686 


307130 


AI1B5234 


305937 


AA883238 


305977 


AA887293 


307451 


AI248615 


307513 


AI274307 


307848 


AI364186 


307871 


AI368665 


307881 


AI370434 


307832 


AJ230822 


307944 


AI416246 


307954 


AM19692 


307965 


A1421641 


309245 


AI972447 


309271 


AJ986221 


309365 


AW072861 


309372 


AW074330 


309435 


AW090537 


309506 


AW137700 


309536 


AW151933 


309709 


AW242630 


325417 c12Jis 




325450 c12 hs 




325452 Cl2~hs 




309815 


AW292760 


309839 


AW29B076 


309849 


AW297444 


309906 


AW339340 


302705 31765 1 


U09060 U09061 


304037 


T26438 


304039 


T47349 


304236 


W83278 


304257 


AA053294 


304382 


AA232873 


304405 


AA282572 


304561 


AA489792 


304569 


AA490934 


304787 


AA582676 


304921 


AA603092 


327819 Q_5JtS 




304968 


AA614308 


306382 


AA968967 


331263 47479J 


AW7B0192 AA015718 W02571 


332252 1663967J 


N63882 T91174 
TABLE 14B shows the genomic positioning for those primekeys lacking unigeneID's and
accession numbers in Table 14. For each predicted exon, we have listed the genomic
sequence source used for prediction. Nucleotide locations of each predicted exon are also
listed.



Pkey: Unique number corresponding to an Eos probeset

Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers

Strand: Indicates DNA strand from which exons were predicted.

Nt_position: Indicates nucleotide positions of predicted exons.
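
The strand and nucleotide-position fields can be combined into a single exon record. The following is a minimal sketch, assuming that positions are written as two integers separated by a hyphen, that the coordinates are inclusive, and that minus-strand entries list the higher coordinate first (as the entries in this table appear to do). The names `PredictedExon` and `parse_exon` are hypothetical and are not part of the original disclosure.

```python
from typing import NamedTuple


class PredictedExon(NamedTuple):
    """A predicted exon normalised so that start <= end."""
    strand: str  # "+" for Plus, "-" for Minus
    start: int   # lower genomic coordinate
    end: int     # higher genomic coordinate

    @property
    def length(self) -> int:
        # Assumes inclusive coordinates; this is not stated in the table.
        return self.end - self.start + 1


def parse_exon(strand: str, nt_position: str) -> PredictedExon:
    """Parse a Strand value and an Nt_position value into a PredictedExon."""
    symbols = {"plus": "+", "minus": "-"}
    key = strand.strip().lower()
    if key not in symbols:
        raise ValueError(f"unrecognised strand: {strand!r}")
    first, second = (int(part) for part in nt_position.split("-"))
    return PredictedExon(symbols[key], min(first, second), max(first, second))


plus_exon = parse_exon("Plus", "297686-297808")
minus_exon = parse_exon("Minus", "1390386-1390296")
print(plus_exon.length)   # 123
print(minus_exon.length)  # 91
```

Normalising to start <= end while keeping the strand symbol separately lets both orientations be compared or sorted on the same coordinate axis.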



Pkey Ref Strand Nt_position

332807 Dunham, I. et al. Plus 297686-297808
332808 Dunham, I. et al. Plus 298277-298360
332812 Dunham, I. et al. Plus 309688-310561
332901 Dunham, I. et al. Plus 1841954-1842090
333149 Dunham, I. et al. Plus 3574317-3574413
333916 Dunham, I. et al. Plus 8298994-8299169
334026 Dunham, I. et al. Plus 9196549-9196681
334061 Dunham, I. et al. Plus 9686941-9687077
334073 Dunham, I. et al. Plus 9792201-9792374
334150 Dunham, I. et al. Plus 10529221-10529854
334379 Dunham, I. et al. Plus 13908356-13908467
334719 Dunham, I. et al. Plus 15778859-15779026
334773 Dunham, I. et al. Plus 16235169-16235328
334893 Dunham, I. et al. Plus 19302753-19302881
334935 Dunham, I. et al. Plus 20108247-20108373
335146 Dunham, I. et al. Plus 21491292-21491457
335320 Dunham, I. et al. Plus 22542132-22542246
335568 Dunham, I. et al. Plus 24935021-24935655
335586 Dunham, I. et al. Plus 24990333-24990497
335601 Dunham, I. et al. Plus 25044923-25045157
336036 Dunham, I. et al. Plus 29019796-29019877
336123 Dunham, I. et al. Plus 30051089-30051186
336268 Dunham, I. et al. Plus 31997555-31998040
337173 Dunham, I. et al. Plus 23624127-23624224
337460 Dunham, I. et al. Plus 32536159-32536395
337685 Dunham, I. et al. Plus 3547161-3547245
337736 Dunham, I. et al. Plus 3850500-3850643
337780 Dunham, I. et al. Plus 4113793-4113890
337965 Dunham, I. et al. Plus 7034267-7034392
337976 Dunham, I. et al. Plus 7166011-7166119
338030 Dunham, I. et al. Plus 8072708-8072827
338112 Dunham, I. et al. Plus 10391398-10391600
338165 Dunham, I. et al. Plus 12205719-12205875
338178 Dunham, I. et al. Plus 12800037-12800181
338427 Dunham, I. et al. Plus 19685043-19685354
338506 Dunham, I. et al. Plus 21221871-21221953
338794 Dunham, I. et al. Plus 27114697-27114763
338910 Dunham, I. et al. Plus 28795375-28795551
339047 Dunham, I. et al. Plus 30760783-30760968
332864 Dunham, I. et al. Minus 1390386-1390296
332933 Dunham, I. et al. Minus 2035790-2035681
333193 Dunham, I. et al. Minus 3832993-3832494
333712 Dunham, I. et al. Minus 7286177-7286073
333940 Dunham, I. et al. Minus 8523830-8523671
333942 Dunham, I. et al. Minus 65526294552330
334287 Dunham, I. et al. Minus 13294116-13293871
334387 Dunham, I. et al. Minus 13946021-13945781
334487 Dunham, I. et al. Minus 14432191-14432132
334913 Dunham, I. et al. Minus 19463909-19463815
335109 Dunham, I. et al. Minus 21325792-21325667
335250 Dunham, I. et al. Minus 21952922-21952826
335286   Dunham, I. et al.    Minus    22304275-22303770 
335290   Dunham, I. et al.    Minus    22309950-22309891 
335549   Dunham, I. et al.    Minus    24666203-24666128 
335862   Dunham, I. et al.    Minus    26690300-26690125 
335864   Dunham, I. et al.    Minus    26694537-26694382 
335905   Dunham, I. et al.    Minus    26988888-26988719 
336205   Dunham, I. et al.    Minus    30477456-30477311 
336276   Dunham, I. et al.    Minus    32093320-32093181 
336433   Dunham, I. et al.    Minus    34067640-34067425 
336605   Dunham, I. et al.    Minus    15616509-15616358 
336616   Dunham, I. et al.    Minus    26021027-26020848 
336679   Dunham, I. et al.    Minus    2035790-2035681 
337043   Dunham, I. et al.    Minus    17407330-17407251 
337272   Dunham, I. et al.    Minus    28241476-28241307 
337357   Dunham, I. et al.    Minus    30908179-30906109 
337393   Dunham, I. et al.    Minus    31471747-31471569 
337497   Dunham, I. et al.    Minus    33371317-33371258 
337646   Dunham, I. et al.    Minus    2648689-2648632 
337920   Dunham, I. et al.    Minus    6051648-6051510 
338083   Dunham, I. et al.    Minus    9318438-9318301 
338220   Dunham, I. et al.    Minus    14166440-14166104 
338752   Dunham, I. et al.    Minus    26421374-26421135 
338763   Dunham, I. et al.    Minus    26628148-26628009 
338983   Dunham, I. et al.    Minus    29908865-29908702 
339209   Dunham, I. et al.    Minus    32492953-32492593 
325240   5866848              Minus    32301-32650 
329532   3983505              Plus     42937-43014 
329522   3983507              Minus    35265-35458 
329519   3983510              Plus     18407-18597 
329511   3983514              Plus     20965-21325 
325326   5866875              Plus     47726-48024 
325303   5866908              Minus    73556-73630 
325389   5866921              Plus     239672-239759 
325417   5866925              Minus    110635-110745 
325450   5866941              Minus    435379-435552 
325452   5866941              Minus    704103-704202 
325498   5866967              Plus     173372-173930 
325587   6682462              Plus     126724-126967 
325602   5866994              Plus     79122-79251 
325701   5B67G28              Minus    72936-73046 
325780   6381953              Plus     63634-63873 
329722   6065785              Minus    112713-112992 
329728   6065785              Minus    207544-207741 
329666   6272129              Plus     98307-98446 
329815   6624888              Minus    68431-68720 
329841   6572062              Minus    40181-40331 
325824   5867048              Minus    42450-42633 
325866   5867076              Minus    94333-94626 
325902   5867101              Minus    127729-127842 
325958   5867142              Plus     53437-53550 
326014   5867160              Minus    10358-10447 
329941   6165199              Minus    34319-34411 
330002   6623963              Plus     46097-46158 
326154   5867170              Minus    7103-7179 
326023   5867245              Plus     171799-171896 
326278   5857269              Plus     75250-75903 
330036   6042048              Plus     117120-117216 
326547   5867307              Minus    623677-623870 
326495   5867423              Plus     11843-11930 
326507   5867435              Minus    13038-13111 
326505   5867435              Minus    88184949 
326506   5867435              Minus    9368-9509 
326530   5867441              Minus    303000-303122 
326508   6682496              Plus     78904-79112 
330120   6671864              Minus    127553-127656 
330123   6671869              Minus    35311-35406 
326858   6552462              Minus    69337-69670 
326983   5867657              Minus    16023-16581 
327014   5867664              Plus     1017630-1017788 
326930   6456782              Plus     606950-607705 
326920   6456782              Minus    4242542519 
327058   6531965              Plus     2384268-2384835 
327061   6531965              Minus    34863894486673 
327075   6531965              Plus     40413184041431 
327120   6531970              Minus    6-1088 
330126   6093735              Plus     82458-82623 
327157   5866841              Minus    44084746 
327183   5867442              Plus     84317-84531 
327192   5867445              Minus    194652-194764 
327288   5867481              Plus     4858348773 
327469   5867772              Plus     145549-145708 
327489   6004459              Minus    57796-58015 
327526   6381882              Minus    97010-97123 
327574   5887818              Plus     68767-69126 
327665   5867839              Plus     141736-141900 
327752   5867949              Plus     93721-94421 
327819   5867968              Minus    92202-92717 
327796   5867982              Plus     85267-85405 
330260   6671884              Plus     4520345269 
330282   6671910              Plus     39824114 
328078   5868008              Plus     72807-72865 
328121   5868031              Plus     153782-153850 
328190   5868077              Plus     21082-21165 
328227   5868105              Minus    21082-21242 
327871   5883131              Minus    88889-89221 
328018   5902482              Minus    542547-543133 
328624   5868246              Minus    120666-120836 
328744   5868290              Plus     138639-138722 
328799   5868316              Minus    80771-80923 
328291   5868363              Minus    144244-144434 
328329   5868375              Plus     191709-192239 
328369   5668388              Plus     75371-75583 
328385   5868395              Plus     369952-370155 
328397   5868397              Plus     344967-345063 
328412   5868405              Plus     66427-86519 
328538   5668485              Plus     38144243 
328656   6004473              Plus     792616-792729 
328638   6004473              Plus     294618-294903 
328903   5868514              Plus     23625-24468 
328960   6456775              Plus     38547-38837 
330320   5932416              Minus    54458-54697 
328993   5868536              Plus     4916060084 
329081   5868602              Plus     93368-93510 
329089   5868614              Plus     25805-26923 
329109   5868626              Plus     102168-102273 
329192   5868716              Plus     166936-167020 
329218   5868726              Minus    71408-71707 
329224   5868728              Plus     27422-27664 
329246   5868732              Minus    250541-250792 
329415   5868874              Plus     1011438-1011818 
329454   5868887              Plus     51342-51593 
TABLE 15: 169 GENES WITH SEQUENCE INFORMATION DEPICTED IN TABLE 16 

Table 15 depicts UnigeneID, UnigeneTitle, Primekey, Predicted Cellular Localization, and 
Exemplar Accession for all of the sequences in Table 16. The information in Table 15 is 
linked by EosCode to Table 16. 



Pkey: Unique Eos probeset identifier number 

ExAccn: Exemplar Accession number, Genbank accession number 

UnigeneID: Unigene number 

UnigeneTitle: Unigene gene title 

EosCode: Internal Eos name 

Localization: Predicted cellular localization of gene product 
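Because Table 15 is linked to Table 16 by EosCode, a lookup keyed on that column is the natural way to move from a sequence entry back to its probeset row. The sketch below is illustrative only: `GeneRow` and `index_by_eos_code` are hypothetical names, the two sample rows are taken from legible Table 15 entries, and the Localization column is omitted. A list is stored per code because the same EosCode can appear on more than one row.

```python
from collections import defaultdict
from typing import NamedTuple, Optional


class GeneRow(NamedTuple):
    """One Table 15-style row (field names are illustrative)."""
    pkey: int
    ex_accn: str
    unigene_id: Optional[str]  # None for primekeys lacking a UnigeneID
    unigene_title: str
    eos_code: str


def index_by_eos_code(rows):
    """Group rows by EosCode so a Table 16 entry can be traced back."""
    index = defaultdict(list)
    for row in rows:
        index[row.eos_code].append(row)
    return dict(index)


rows = [
    GeneRow(407122, "H20276", "Hs.31742", "ESTs", "PEW7"),
    GeneRow(411096, "U80034", "Hs.66583",
            "mitochondrial intermediate peptidase", "PEZ9"),
]
by_code = index_by_eos_code(rows)
print([row.pkey for row in by_code["PEZ9"]])
```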



Pkey ExAccn UnigeneID UnigeneTitle 



100394 
100452 
101249 
101485 
101514 
101851 
102398 
102522 



103119 
103709 
104080 
104144 
104691 
105370 
106149 
106579 
107102 
107217 
108153 
109014 
109112 






127637 
128790 
129109 
129184 



D84276 Hs.66052 CD38 antigen (p45) P6C1 
D87742 Hs.241552 KIAA0268 protein PAB7 
L33881 Hs.1904 protein kinase C, iota OAA1 
M24736 selectin E (endothelial adhesion molecul ACC5 

M28214 Hs.123072 RAB3B, member RAS oncogene family PFJ2 
M94250 Hs.82045 midkine (neurite growth-promoting factor LBH9 
U42359 gb:Human N33 protein form 1 (N33) gene, PDG3 

U53347 Hs.183556 solute carrier family 1 (neutral amino a PFJ4 
U71207 Hs.29279 eyes absent (Drosophila) homolog 2 LEM9 
X63629 Hs.2877 cadherin 3, type 1, P-cadherin (placenta LBG2 
AA037316 Hs.13804 hypothetical protein AJ4620232 POOS 
AA402971 Hs.57771 kallikrein 11 PBA6 
AA447439 Hs.183390 hypothetical protein FLJ13590 P0M3 
AA011176 Hs^7744 Homo sapiens beta-1 adrenergic receptor PAV1 
AA236476 Hs.22791 transmembrane protein with EGF-like and PDM9 
AA424881 Hs.256301 hypothetical protein MGC13170 PD08 
AA456135 Hs.23023 ESTs PAA4 
AA609723 Hs£0652 KIAA1344 protein PAA3 
D51095 DKFZP586E1621 protein P0G8 

AA054237 Hs.40808 ESTs PBF1 
M1 56790 Hs.262036 ESTs, Weakly similar to Z223_HUMAN ZINC 
AA169379 Hs.257924 hypothetical protein FLJ13782 BCU4 
H04649 Ha20843 Homo sapiens cDNA FLJ11245 fis, clone PL 
H18836 Hs.31608 hypothetical protein FLJ20041 PAV9 
T17185 Hs.83883 transmembrane, prostate androgen induced 

Hs.129836 KIAA1028 protein PD03 
Hs.54973 cadherin-like protein VR20 PFJ6 
Hs.72472 ESTs BCY2 
hypothetical protein MGC2648 PDV3 
ESTs OAB6 
Hs.45107 ESTs PDT9 
Hs.106778 ATPase, Ca++ transporting, type 2C, memb 
Hs.55028 ESTs, Weakly similar to 154374 gene NF2 POM8 
Hs^78635 Homo sapiens prostein mRNA, complete cds 
Hs.117183 ESTs PBFB 
Hs.97594 KIAA1210 protein PDG5 
prostate androgen-regulated transcript 1 POV5 
ESTs; protease inhibitor 15 (PI15) BCU7 
Hs.98732 Homo sapiens Chromosome 16 BAC clone CIT 
Hs.128749 alpha-methylacyl-CoA racemase PD01 
Hs.203270 ESTs, Weakly similar to ALU1_HUMAN ALU S 
Hs£93185 ESTs, Weakly similar to JC7328 amino aci PAV4 

transmembrane, prostate androgen induced 
Hs.61635 six transmembrane epithelial antigen of PAA5 
Hs.182575 solute carrier family 15 (H+/peptide tra PD05 
Hs.162859 ESTs PAA6 
Hs.105700 secreted frizzled-related protein 4 BCX2 
Hs.108708 calcium/calmodulin-dependent protein kin PFJ7 
Hs.109201 CGI-86 protein PAV6 
129389 AA621604 spondin 2, extracellular matrix protein CJA5 
plasma membrane 
not determined 



110151 
112971 
113021 
114908 
114965 
116393 
116416 
117698 
117984 
118985 
119018 
119126 
120992 
121710 
121913 
122041 
122593 
123209 
124526 



126645 



AA236545 

AA250737 

AA599463 

AA609219 

N410Q2 

N51919 

N94303 

N95796 

R45175 

AA398246 

AA419011 

AA428062 

AA431407 

AA453310 

AA489711 

N62096 

AA128075 

AI167942 

R38438 

AA569531 

AA291725 

AA491295 



plasma membrane 
cytoplasmic 



plasma membrane 
cytoplasmic 
plasma membrane 



plasma membrane 
plasma membrane 

plasma membrane 
not determined 

plasma membrane 
PDG7 

not determined 
PDG4 

plasma membrane 
CHA1 not determined 

plasma membrane 
mitochondrial 



ER 

PAJ5 not determined 
PAB2 plasma membrane 



vesicular 

PAZ1 not determined 

PAA2 plasma membrane 
plasma membrane 
PDY4 

plasma membrane 
plasma membrane 
not determined 
secreted 

vesicular 
not determined 



129404 
129534 
130760 
131426 
132964 
132967 
133179 
133330 
133520 
133724 
133724 
133944 
134110 
301805 
302005 
302681 



AA172056 

R73640 Hs.11260 

AA128997 Hs.18953 

AA219134 Hs.26691 
AA031360 

AA032221 Hs.61635 
Hs.66731 



ESTs 

hypothetical protein RJ11264 



U42360 
X74331 
U07919 
U07919 



Hs.71119 
Hs.74519 
Hs.75746 
Hs.75746 



303753 
308050 
310382 
310431 
310573 
310598 
310816 
311596 
313676 
314121 
314691 
314785 
314907 
315051 
315052 
316442 
317548 
317869 
318524 
319191 
319763 
320324 
320561 
320796 
321441 



322782 
322818 






324295 
324430 



324617 
324626 



32471B 
330211 
330546 
330762 
330790 
330892 
331099 
331490 
331689 



AA045B70 Hs.7780 
U41060 Hs.79136 
AJ 800004 Hs.142846 
AI869666 Hs.123119 
AA508353 Hs.105314 
AA340605 Hs.105887 
D30891 Hs.19525 
AW503733 HSJ9414 
AI460004 Ha31608 
AI734009 Hs.127699 
A142Q227 Hs.149358 
AW292180 Hs.156142 
AJ338013 Hs.140546 
AI973051 Hs.2249B5 
AI682088 Hs.79375 
AA861697 Hs.120591 
AI732100 Hs.187619 
AW207206 Hs.136319 
AI538226 Hs.32976 
AI672225 Hs.222866 
AW292425 

AA876910 Hs.134427 
AA760894 Hs.153023 
AI654187 Hs.195704 
AW295184 Hs.129142 
AW291511 Hs.159066 
AF071538 

AA460775 Hs.6295 
AF071202 Hs.139336 
NM_006953 Hs.159330 
AF038966 Hs.31218 
AW297633 Hs.118498 
W07459 Hs.157601 
AA056060 Hs.2Q2577 
AW043782 Hs.293616 
AFD55019 Hs.21906 
AA6399Q2 Hs.104215 
AI146686 Hs.143691 
AA464018 Hs.184598 
AW016378 Hs.292934 
AA508552 Hs.195839 
AI685464 

AI694767 Hs.129179 
AI557019 Hs.116467 



U31382 

AA449677 

T48536 

AA149579 

R36671 

N32912 

AA431407 

N58172 

AA340504 

T94885 



Hs.299867 

Hs.15251 

Hs.122764 

Hs.91202 

Hs.14646 

Hs.291039 

Hs.88802 



332798 
334447 



PAB4 
PAJ3 
PEE6 

ESTs PBA7 
ESTs PAA7 
six transmembrane epithelial antigen of PM17 
homeobox B13 PFJ5 
Putative prostate cancer tumor suppresso PDM1 
primase, polypeptide 2A (58kD) PDM2 
aldehyde dehydrogenase 1 family, member 
aldehyde dehydrogenase 1 family, member 
Homo sapiens mRNA; cDNA DKFZp564A072 (fr 
LIV-1 protein, estrogen regulated BCR4 
hypothetical protein PEU4 
MAD (mothers against decapentaplegic, Dr PBJ6 
relaxin 1 (H1) PBH3 
ESTs, Weakly similar to Homolog of rat Z PEG4 
hypothetical protein FLJ22784 PBM4 
KIAA1468 protein PBY3 
hypothetical protein FLJ20041 PEU5 
KIAA1603 protein PCQ8 
ESTs, Weakly similar to A46010 X-linked PBH1 
ESTs PEN3 
ESTs PCW3 
ESTs PETS 
holocarboxylase synthetase (biotin-[prop PBH8 
ESTs PBY2 
ESTs PBY1 
ESTs BFF8 
guanine nucleotide binding protein 4 CB07 
ESTs, Weakly similar to TRHY_HUMAN TRICH 
ESTs PBM9 
ESTs PBJ7 
ESTs PBJ9 
ESTs PBQ6 
deoxyribonuclease II beta PBQ7 
hypothetical protein FLJ10188 PBJ1 
prostate epithelium-specific Ets transcr PEN1 
ESTs, Weakly similar to T17248 hypotheti PE07 
ATP-binding cassette, sub-family C (CFTR PBH5 
uroplakin 3 Pa9 
secretory carrier membrane protein 1 PBY4 
Homo sapiens LUCA-15 protein mRNA, splic 
ESTs CBF9 
Homo sapiens cDNA FLJ12166 fis, clone MA 
ESTs PCQ7 
Homo sapiens clone 24670 mRNA sequence 
ESTs, Moderately similar to SPCN_HUMAN S 
ESTs PBQ9 
Homo sapiens cDNA: FLJ23241 fis, clone C 
ESTs PBM3 
ESTs, Weakly similar to 138022 hypotheti PBH4 
gb:it88f04jd NCI_CGAP_Pr28 Homo sapiens 
Homo sapiens cDNA FLJ13581 fis, clone PL 
small nuclear protein PRAC C6K1 

PBJ2 

guanine nucleotide binding protein 4 PEW4 
hypothetical protein PBM1 
TMPRSS2, transmembrane protease, serine 
ESTs PBQ4 
Homo sapiens mRNA; cDNA DKFZp564D016 (fr 
ESTs PCM 
ESTs, Moderately similar to T14342 NS01 PBH7 
gb:za21f09^1 Soares fetal liver spleen PBQ5 
gb:mv31a09.x1 NCI.CGAPJOdll Homo sapien 
transgelin 2 PBQ8 
PBH2 
PBY9 
PBY7 



nuclear 

plasma membrane 
plasma membrane 
nuclear 



POT1 mitochondrial 
POT1 mitochondrial 
PAB9 cytoplasmic 
plasma membrane 
nuclear 
cytoplasmic 



not determined 
not determined 
plasma membrane 

plasma membrane 
plasma membrane 



not determined 
cytoplasmic 
PBM2 not determined 

plasma membrane 



cytoplasmic 



plasma membrane 
plasma membrane 
not determined 
PBY8 not determined 



PBQ1 not determined 
plasma membrane 
PCG not determined 

pais 

not determined 
PBY6 not determined 

cytoplasmic 
PCW6 

PBJ4 plasma membrane 

nuclear 

not determined 

cytoplasmic 

not determined 

PEL3 plasma membrane 

plasma membrane 

PCQ1 cytoplasmic 

nuclear 

not determined 



PBJ8 not determined 

secreted 

nuclear 

not determined 
not determined 
401424 PFQ2 
407122 H20276 Hs.31742 ESTs PEW7 
408430 S79876 Hs.44926 dipeptidylpeptidase IV (CD26, adenosine PEZ3 
408826 AF216077 Hs.48376 Homo sapiens clone HB-2 mRNA sequence 
409262 AK000631 Hs.52256 hypothetical protein FLJ20624 PFG1 
409361 NM_005982 Hs.54416 sine oculis homeobox (Drosophila) homolo PEW3 
411096 U80034 Hs.66583 mitochondrial intermediate peptidase PEZ9 
413125 BE244589 Hs.75207 glyoxalase I PFJ3 
413623 AA825721 Hs.246973 ESTs OBH6 
414422 AA147224 Hs.337232 homeobox A13 PFC6 
415263 AA946033 Hs.130853 ESTs PEZ5 
417153 X57010 Hs*1343 collagen, type II, alpha 1 (primary ost PFJ1 
418601 AA279490 HsJ6368 calmegin PFA1 
418848 AI820961 Hs.193465 ESTs PEY4 
418882 NM_004996 Hs*9433 ATP-binding cassette, sub-family C (CFTR OBH2 
419839 U24577 Hs.93304 phospholipase A2, group VII (platelet-a PFH9 
421687 AW161450 Hs.109201 CGI-86 protein PFH2 
422083 NM_001141 Hs.111256 arachidonate 15-lipoxygenase, second ty PFH5 
424565 AW102723 Hs.75295 guanylate cyclase 1, soluble, alpha 3 PFA3 
425071 NM_013989 Hs.154424 deiodinase, iodothyronine, type II PFH6 
425710 AF030880 solute carrier family, member 4 PFD4 
427958 AA418000 Hs.98280 potassium intermediate/small conductance PFH1 
428619 AL135623 Hs.193914 KIAA0575 gene product PFD6 
429800 AA460421 Hs.30875 ESTs PEZ7 
429918 AW873986 Hs.119383 ESTs PEY5 
430226 BE245562 Hs£551 adrenergic, beta-2-, receptor, surface PEZ4 
431217 NM_013427 Hs25Q830 Rho GTPase activating protein 6 PFQ6 
431716 D89053 Hs.268012 fatty-acid-Coenzyme A ligase, long-chain PEZ1 
431992 NM_002742 Hs.2891 protein kinase C, mu PFH4 
432189 AA527941 gb:nli3Qc04.s1 NCI_CGAP_Pr3 Homo sapiens 
432244 AI669973 Hs£00574 ESTs PEWS 
432437 W07088 Hs593685 ESTs PFG3 
432966 AA650114 Hs£25198 ESTs PEY3 
439176 AI446444 Hs.190394 ESTs, Weakly similar to B28096 line-1 pr PEWS 
440260 AI9728S7 Hs.7130 copine IV PEW6 
440901 AA909358 Hs.128612 ESTs PFC8 
445424 AB028945 cortactin SH3 domain-binding protein PEZ6 
446320 AF126245 Hs.14791 acyl-Coenzyme A dehydrogenase family, m 
447210 AF035269 phosphatidylserine-specific phospholipas PFH8 
449156 AF103907 Hs.171353 prostate cancer antigen 3, non-coding DD PEZ8 
449625 NM_014253 odz (odd Oz/ten-m, Drosophila) homolog 1 PEZ2 
449650 AF055575 Hs.23838 calcium channel, voltage-dependent, L ty PFD2 
451939 U80456 Hs.27311 single-minded (Drosophila) homolog 2 PFJB 
451982 F13036 Hs.27373 Homo sapiens mRNA; cDNA DKFZp56401763 (f 
452039 AI922988 ESTs PFD8 
452340 NM_002202 Hsi05 ISL1 transcription factor, LIM/homeodoma PFG4 
452784 BE463857 Hs.151258 hypothetical protein FLJ21062 PFC5 
452946 X95425 Hs.31092 EphA5 PFH3 



plasma membrane 

PEY1 

nuclear 



mitochondrial 
cytoplasmic 



secreted 
ER 



secreted 

plasma membrane 



secreted 

plasma membrane 
plasma membrane 
nuclear 



plasma membrane 



cytoplasmic 
PFA2 



PFH7 

plasma membrane 
plasma membrane 

PFG9 plasma membrane 

nuclear 
cytoplasmic 
plasma membrane 
TABLE 15A shows the accession numbers for those primekeys lacking a UnigeneID in Table 
15. For each probeset we have listed the gene cluster number from which the 
oligonucleotides were designed. Gene clusters were compiled using sequences derived from 
Genbank ESTs and mRNAs. These sequences were clustered based on sequence similarity 
using Clustering and Alignment Tools (DoubleTwist, Oakland, California). The Genbank 
accession numbers for sequences comprising each cluster are listed in the "Accession" 
column. 






Pkey: Unique Eos probeset identifier number 

CAT number: Gene cluster number 

Accession: Genbank accession numbers 



101485 16113_1 

126399 17331_1 

132984 
129389 



Pkey CAT number Accession 

116393 131543J A1972402 AIS34409 AJ523716 AI799749 W44518 A1424438 A1688513 Al 97 1048 AI685324 AW013854 AA588483 AA5281 1 1 AI627428 
A1582200 AI669296 A1826926 A1620526 AI669956 AI972458 AI924500 AA512903 W44517 AA335363 AW238997 6E3001 65 
BE250665 AA2841 95 AA523420 W52834 AI471 970 A1952824 AW003820 AW009463 AA669796 AA1 14986 AJ653342 AA1 15038 
AI342150 AI092100 A1968211 W51994 AI804005 A1201420 A1123210 AI738405 AI674954 AI970341 AW02750O AJ49331 6 AI3331 93 
AI139353 AA599463 AI8561 63 AI804200 AI365321 AJ 9 9 02 13 AI65701 1 AA650025 A06881 0AI341978 AA599839 AW592602 
AA644289 AI468578 A156S265 A1565228 BE221535 AW973052 

AA296520 AL021 940 M30640 NM.00O450 M24736 M61894 AL047443 H39560 AI694691 AA916787 AJ21479B AA939085 Al 15061 6 
AA412553 AA412545 AJ051015 T27S54 AA694430 

AA088767 AF224278 AA128075 AL035541 AA027926 AT761441 A1972096 AW071693 AI742327 AI377498 AI80481 5 AI6408Q2 
AI885001 AI921394 AA5951 15 N71820 AI921217 AW007283 AI467828 AI369306 AA917446 AI493698 AA088701 AA126899 AI936228 
AW204238 AI039567 AI925027 BE138909 AW452945 AW 135993 AA310934 AAQ27B60 AW07351 9 AI537597 AA953976 AJ521 341 
AW273569 AW050740 AA536113 AA559064 AI474392 AW135709 AA535181 AW572959 AA570597 AI9Q5464 A1677B10 A1587642 
AW9751Q2 AA424310AA482527 N64192 AA658276 AW889117AA486591 AW889172A1381990 AI381991 A1673419 AI990950 
AA487031 A1272934 Al 150565 AA229168 AW316722 AI1 42707 BE222396 AA6141 68 M122026 AW338227 AA632457 AI968726 
AW369662 AA512956 AA541675 AA451748 AI250993 BE146418 AA122Q25 
A1362575 AI805082 AW263421 AI432462 AA135870 AA031360 AA031604 AA298475 AA298464 

NM.012445 ABQ27466 BE407510 BBD47605 AA047125 AW084003 AA149494 AA149490 AA292528 AA570505 AA526186 AW006250 
AW007762 AI341557 AI799666 AI972710 A1377966 AI962810 AE084783 AI458032 AI190971 AW148913 AA372354 AW970032 
AW007426 AA6501 88 AH 23203 A1122890 AI280975 W73595 W73495 A1863238 AA374109 AA603986 AW149089 AW957523 
AI307748 AI921067 AI336463 F24537 A1380460 AI367500 Al 189309 AI814701 AI765921 AW572106 AA037024 AW072576 AA578293 
AK288103 AA235464 AW450642 AA574230 AW294024 A1589229 AS80733 AW512227 AA877009 AI66Q255 AW1 88597 AA558228 
A1572782 AA658397 AI274628 AI866359 AA864573 AI264439 AA621604 AW515493AW243333 Z39737 AI567038 AA573997 
AA573559 AW236431 AI652B70 AI684973 AA034505 AA047126 

AK67700 AI720344 AA191424 AI023543 AI469633 AA1 72056 AW958465 AA172236 AW953397 AA355086 
AL080235 AA031750 D81382 AI480231 AJ095947 AI560953 BE010721 AI87Q290 AA374945 AA125792 D51527 D51555 AI685541 
D51559AW117286M195741AI675138AW593439AI20tt^ 
AI205532 AA127069 AI337357 051 595 AI453785 AW075677 AW088359 C14287 C14284 

AF1 63474 NM.016590 AF163475 AI761105 AI770098 AA410580 AA41 1616 AI590343 AI739050 AL050198 AI862645 AA419104 
AA513809 AA333032 AI816915 AW139625 AA640889 AI311391 AI627693 AW135514AA419011 AI269149AI245259 AJ970008 
AI970017 AW139445 AA569503 AI761072 AI7661 79 A1759995 A1300776 AI870129 AW150770 AA226501 AA226220 
A1249388 AI742316 AA428062 AA442089 AI864189 BE349478 A1803475 AI584049 BE552085 AI088609 AI264197 AI886144 Al 129474 
AI307145 BE181300 AW058403 AI696838 AW748598 AA442196 AI216428 
entrezL.U42359U42359 

347217J AW292425 BE467167 A1702953 BE550961 BE222309 AI299348 AI693336 AA541708 
A1685464 AW971336 AA513587 AA525142 

NMJM2391 AF071538 AB031549 AI685592 AT745526 AA662204 AW 130657 AA662164 AW971 121 AI668916 AA513274AI991223 
AI979170 AW298436 AA639821 AI859010 AW513942 AI687669 AA662521 AA548598 AI345056 AI305374 BE043418 AI432B56 
AI334840 AI379796 AI492693 AI307915 BE042082 A1307834 AI307858 AI309488 BE042210 A1435670 AI371605 AI862491 A1284563 
AI306872 AI255044 AJ254601 AI251236 AJ473073 AW73042 AI432760 AI435664 AJ336826 AI289365 AI369096 AI862274 AJ 334871 
AI349863 AI250405 AI377617 A1309895 AI313017 AI862291 AI31 1936 AI378718 AI305722 AI306769 A1308888 AI334565 AI862296 
AI344230 A1435685 AJ344087 A1378696 AI311209 AI435775 AI310611 AI311154 A1432289 AI431561 AI492681 A1432867 AI335288 
AI492796 AI432769 AI310299 AI432273 AI379820 AI275319 AI435753 A1609441 A1432767 AI369100 AG 11420 AI349974 AI247157 
AI334677 AI270910 AI224320 AI305608 A1334489 A13771 52 AI350012 A1370086 AI335053 A1306781 A1306750 AI334849 AI334874 
AI340380 AI307876 A1305974 AI305972 AI311521 AI334872 AI862509 A131 1 498 AI335051 A1289684 AI310859 AJ311862 AI862483 
AI492775 AI307906 AI492708 AI289693 AI340373 AI307910 A131 1359 AI435653 A1334865 A131 1492 AI492809 AI492690 AJ431576 
AI862268 AI311879 AI308435 A1492792 A1862512 AI275321 AI431568 AI431564 A1307885 AJ307826 AI435692 AI435778 A1310182 
AI308894 AI492707 Al 49271 3 AI308560 AI307829 AI343234 AI580598 AW472798 AI340918 AB1Q243 AJ309368 AI307920 AI269665 



129404 
107217 



94346_1 
21074_1 

156454_1 
9336_1 

121710 19266_1 

121913 291015_1 

102398 
315051 

324626 338411_1 

319191 16065_1 
330211 
332798 
334447 
332247 



AI306777 AWQ8631 8 AW086292 AW086378 AI310027 A1275293 AI369082 AI340900 A1306749 AJ 37 1558 AW086287 BE0438Q3 
AI306793 AI306272 A1287948 AJ270917 AI284816 AI336B13 AI284546 A1308044 AI275290 AI270872 AI306795 AE89687 A1223570 
AI305303 A12B9677 AI287742 AI275284 AI306812 AI336701 AI371554 AJ378719 AI344988 A1223831 AI335141 AB43222 AI284568 
AI305357 A1275270 AI345932 AJ43B549 A1307925 AI31 1502 AI344238 AI3431 82 AI308508 AI305988 AI270790 AB79792 AI305647 
AI305410 AI432251 AI436517 A1343227 A1305534 A1340387 AI271043 AI305499 AI271046 A1305962 A1289465 AI305378 AI289725 
AI310848 AI305848 AI289382 A1252964 AI307049 AI310831 AI306993 AB06796 AI224659 AI305969 AI349855 AS306164 AI306948 
AI2S4676 AI309155 AI343202 AI432785 AI306815 AI369081 AI270885 A1269699 AI435704 A1309647 AI305716 AI31 1281 AI287927 
AI472995 AJ340423 AI270958 AI307089 A13C6364 AK70807 AI275306AI311B90AI275263AM32750AJ289371 AJ432861 A125S113 
AI305709 A1473008 AJ311168 AI30971 1 A1377164 AI271201 AI289560 AI309710 AI306195 AJ31 1201 AI287741 A1271066 AI432876 
A1275281 AJ379795 AI472972 AI311967 A1306826 AI305465 A1270792 AI473019 A1305340 A1270822 AI305995 A1305462 AI254144 
AI270969 AI473012 A1305390 AI2TO278 A1223644 A1289692 AI250318 A1305372 AI289691 A1250521 A1306283 AI306B14 AJ307933 
AI4731 60 AI432903 A1223720 AI254979 AI334862 AI306926 AI289541 AI432248 AI435722 AI435698 A1432859 AI310683 AI473175 
AI3351 44 AC89467 AI436489 A1306928 AW73033 AI305783 AI307888 A1307882 AI348959 AI435738 AI432857 AM32896 AI435735 
AI432283 AI473086 AI432863 AW73081 AI432825 AI307840 AI473164 AI432885 AI473166 AI472982 AI435734 A1473060 AI473171 
A1432279 AI432882 AI334670 A1438512 AI432827 AI432852 AW73051 A1473077 AM35697 AJ271 509 AW92781 AM72983 AJ47301 8 
A1432897 A1473043 AI432871 AI436536 AI473157 AI349715 AI432777 AJ473016 A1473158 A1340369 AI307941 A1432773 A1377146 
AI492791 AI270950 AI305342 AE84604 AB06269 AI28481 1 AI27081 1 AI289347 AI334869 AI334852 At31 1759 A12S0382 A1309520 
A1289550 A1305721 AI340870 AI270901 AI308575 AI307904 A1340715 AI270941 AI309808 A1246867 AI473014 AI307039 AI289380 
AJ473069 AI492766 AI344013 AI305876 AI436510 AI340742 AI473028 AI307891 BE041871 BE041288 BE042340 BE041946 
BE041783 AI306173 AI201948 AI926972 A1275769 

CH22JB56FG_UNK_EM:AC00 

C_5_p2 

CH22 J4FGL6J5JJNKJC4G1 .G 
CH2a_1746FG.387J7_UNILEM 



.1 



332697 13699_1 

425710 
432189 
445424 

25529_1 
342819_1 
6391_1 

447210 7119_1 

8113_1 

452039 89513_1 



AA669097 AA513815 AAQ2679B AA676526 AA704429 AA704269 AW118292 AA579216 N58172 
AW579842 BE156562 BE156690 BE156489 BE081033 AK001559 BE149402 M85387 AW36781 1 AW367798 R17370 AI908947 
AA382932 R58449 H18732 AA371231 AW982899 AA713530 AW892946 R53463 H1 1063 AW068542 Z40761 BE176212 6E176155 
W23952 W92188 AW374883 AA303497 AW954769 AA036808 BE168063 AW382073 AW382085 AL041475 H80748 AI078161 
BE463983 AJ805213 AI761264 W94885 N945Q2 AI623772 AI419532 AI810302 AJ 634190 AW002516 AW 150777 AI352312 A1367474 
AW204807 AI6755Q2 AI337026AW134715BE328451 AI123157 AI560020 AI300745 AJ608631 A1248873 AA742484 AW051635 
H18646 AI245045 AA5071 1 1 AI64051 0 AI925594 AA1 15747 AA143035 AA151 106 

X51405 NM.001873 T11322 AL118886 BE328175 AW136009 BE467445 AW470313 AA774852 BB04139 AW501046 AA082792 
AW389231 AA370044 R36841 AA371457 C04813 R25791 R25556 AW895854 AW903819 AW895671 AW895677 BE159723 
AW895564 AW895597 AW895595 AW895665 AW888518 AI903724 FO6081 F08503 AL1 19462 AW895730 AW88851 8 R2651 1 
R26489 AA334126 AA327626 N85713 AW895998 AA223622 F05468 AA370749 W05590 M78202 AA371073 AW498607 R15017 
T16991 AA001282 AA001 138 AA551566 AA330159 AJ922855 AA383512 AA029603 082246 D82171 794933 K56545 AA348060 
AA176888 R98764 AW451817 AA385766 AA452618 AI690057 AA988822 BE549928 AA150901 W57992 AW899925 C05281 
AA932042 AA370980 AW962877 W04741 AA369982 AW385948 AA922466 N75882 AI422070 AJ361256 AJ680224 D57122 T94885 
R53266 R46713 T19071 AW796277 AA325333 F04719 FQ2334 AA358146 AA626597 AA358304 AW028099 AL1 19570 D57290 
D58273 D57796 N48555 Ai3S1969 AA329457 D57225 AWQ24046 AA992606 AWQ221 16 AW021538 AA935845 H8Z870 K55546 
AW961219 AA453239 AW837541 N45521 BE218029 AA318877 AA327740 AW961 809 T92139 D53216 D52365 D53363 D53312 
D531 16 A1547267 AA679935 AW026552 AWQ26418 AW1 90507 A1927710 AW244108 050948 AW054991 AW021063 AWQ2251 1 
AA493436 AI365636 BE464751 AW 149384 AA102442 AW771368 AI818251 AI126368 051049 AI421542 A1559467 AW079779 
AW021048 AW023969 AW044214 AI458264 AA027274 AI620254 AW028917 BE21951 1 AA326242 N67561 AJ971273 AA878328 
057131 AA770662 A1309299 AI796767 AA613338 W58078 AI586287 AI445573 A1880260 AA001 91 9 AW339259 A1492610 AI49261 1 
R97692 AI301425 AA722603 D58361 AI350323 AA973926 AI431263 AA516126 AA865467 A1925177 N39443 AA001943 A1299371 
A1082412 AA665O90 AA583433 H89871 AA977231 AI362219 AIQ56096 AI270448 N87524 N22103 AWB14224 AA744054 AW243622 
At61 3188 AI929173 AI350243 AI3621 38 AA744004 AA176661 D56787 AI955625 AI393109 AI094769 A1479726 AI423107 A1955617 
AI034036 A1582196 AW264534 AJ418961 AA570761 A1343538 AA650341 AA992503 AA770004 AL039B68 AI862675 AW1 90335 
AA610274 AW41 8827 BE467472 056786 728749 AB17610 AI359556 T23523 AL040189 AA846222 AA651636 D51280 A1888986 
AI521167 A1340177 AW612815 A1525285 AA621607 AA177059 AA229768 AA829788 AI749682 AW190631 N75299 AA230089 
AI915632 BE069542 AA890020 AA528397 AA995390 BE503660 AA570812 AW339396 AI197986 AI203725 AI282379 AA670375 
AA461513 F01728 AW243599 C00856 N75567 R95995 AA150932H95961 AA648060 AA933800 AA927073 AA101 126 AA864190 
T93566BE167472 

AF030880 NM.000441 AC002467 AA385554 H23053 AW891838 Al 139968 AA653057 A1695233 
AA527941 AI81Q608 AI620190 AA635266 

ABQ28945 T77648 F13328 AL 157605 Z46212 AA304736 F11855 T66098 T30174 AW954164 AW 176301 AW748243 AA456428 
A1369958 AA938565 AW959613 Z42008 AA994779 Al 683909 F11019 F10926 AI769597 AI752550 T65015 AI884314 AA643954 
Z41838 AW020147 AI038822 AW571822 AA299781 AA894928 AF131790 BE005411 AJ9Q2476 AW082695 AA464384 R42750 
AW9Q2301 AA464273 R05837 Z38294 H41098 AL134507 M86079 

AF035269 AF035268 NM_015900 T96213 U37591 AA156832 AA299371 AI084325 H95977 AI7B5967 BE221465 AA156726 A1969563 
AW024539 AI436791 AI949451 AA843093 AI452756 AAB24232 AI306667 T96131 AW207447 AW243556 AW957032 AI084332 
H95978 U30998 

NKL014253 AF1 00772 BE088769 AL022718 BE161779 AW863569 BE161640 AL039060 BE168542 AW296554 AA323193 AA235370 
AW779760 N48674 AI375997 R45432 059344 AI203107 F07491 R35360 R25094 AI913631 AI498402 T61382 A1016320 N45526 
T61415AA331486 

A1922988 K05475 AA021608 AW169947 AA913750 Z41614 AW800012 
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TABLE 15B shows the genomic positioning for those primekeys lacking unigene ID's and 
accession numbers in Table 15. For each predicted exon, we have listed the genomic 
sequence source used for prediction. Nucleotide locations of each predicted exon are also 
5 listed. 


Pkey:          Unique number corresponding to an Eos probeset 
Ref:           Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495. 
Strand:        Indicates DNA strand from which exons were predicted. 
Nt_position:   Indicates nucleotide positions of predicted exons. 


Pkey     Ref                  Strand   Nt_position 
334447   Dunham, I. et al.    Plus     14308764-14308824 
332798   Dunham, I. et al.    Minus    232147-231974 
338255   Dunham, I. et al.    Minus    15242294-15242231 
330211   6013592              Plus     59158-59215 
401424   8176894              Plus 
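
As an illustration of how the Table 15B columns can be read, the following minimal Python sketch parses the Nt_position field and reports the span of each predicted exon. It assumes that the two coordinates are inclusive and that minus-strand exons are printed high-to-low, as the rows above suggest; the class name `PredictedExon` and its methods are hypothetical helpers, not part of the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PredictedExon:
    """One Table 15B row (hypothetical container for illustration)."""
    pkey: int
    ref: str
    strand: str        # "Plus" or "Minus", as printed in the table
    nt_position: str   # "first-last", as printed in the table

    def bounds(self):
        # Return (low, high) regardless of the order in which the
        # coordinates are printed (minus-strand rows list high first).
        first, last = (int(x) for x in self.nt_position.split("-"))
        return min(first, last), max(first, last)

    def length(self):
        # Assumption: both end coordinates are inclusive.
        low, high = self.bounds()
        return high - low + 1


# Rows with a listed Nt_position, copied from Table 15B.
rows = [
    PredictedExon(334447, "Dunham, I. et al.", "Plus", "14308764-14308824"),
    PredictedExon(332798, "Dunham, I. et al.", "Minus", "232147-231974"),
    PredictedExon(338255, "Dunham, I. et al.", "Minus", "15242294-15242231"),
    PredictedExon(330211, "6013592", "Plus", "59158-59215"),
]

for exon in rows:
    low, high = exon.bounds()
    print(exon.pkey, exon.strand, low, high, exon.length())
```

Normalizing to (low, high) before computing the length keeps the arithmetic independent of strand, so the same helper works for both plus-strand and minus-strand rows.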
TABLE 11 AND SEQUENCE LISTING 

SEQ ID NO:1 BCU4 DNA SEQUENCE 

Nucleic Acid Accession #: NM_024915 

Coding sequence: 13-1890 (underlined sequences correspond to start and stop codons) 
1          11         21         31         41         51 

|          |          |          |          |          | 
ATTGGATCAA ACAXgTCACA AGAGTCGGAC AATAATAAAA GACTAGTGGC CTTAGTGCCC 60 

ATGCCCAGTG ACCCTCCATT CAATACCCGA AGAGCCTACA CCAGTGAGGA TGAAGCCTGG 120 

AAGTCATACT TGGAGAATCC CCTGACAGCA GCCACCAAGG CCATGATGAT CATTAATGGT 180 

GATGAGGACA GTGCTGCTGC CCTCGGCCTG CTCTATGACT ACTACAAGGT TCCTCGAGAC 240 

AAGAGGCTGC TGTCTGTAAG CAAAGCAAGT GACAGCCAAG AAGACCAGGA GAAAAGAAAC 300 

TGCCTTGGCA CCAGTGAAGC CCAGAGTAAT TTGAGTGGAG GAGAAAACCG AGTGCAAGTC 360 

CTAAAGACTG TTCCAGTGAA CCnTCCCTA AATCAAGATC ACCTGGAGAA TTCCAAGCGG 420 

G AACAGTACA GCATCAGCTT CCCCG AGAGC TCTGCCATCA TOOOGGTGTC GGGAATCACG 480 

GTGGTGAAAG CTGAAGATTT CACAOCAGTT TTCATGGOOC CACCTGTGCA CTATOCCCGG 540 

GGAGATGGGG AAGAGCAACG AOTGGTTATC TTTGAACAGA CTCAGTATGA CGTGOCCTCG 600 

CTGGCCACCC ACAGOGCCTA TCTCAAAGAC G ACCAGCGCA GCACTCCGGA CAGCACATAC 660 

AGCGAG AGCT TCAAGG ACGC AGCCACAG AG AAATTTCGGA GTGCTTCAGT TGGGGCTGAG 720 

GAGTACATGT ATGATCAGAC ATCAAGTGGC ACATTTCAGT ACACCCTGGA AGCCACCAAA 780 

TCTCTCCGTC AGAAGCAGGG GGAGGGCCCC ATGACCTACC TCAACAAAGG ACAGTTCTAT 840 

GCCATAACAC TCAGCGAGAC CGGAGACAAC AAATGCTTCC GACACCCCAT CAGCAAAGTC 900 

AGGAGTGTGG TCATGGTGGT CTTCAGTGAA GACAAAAACA GAG ATGAACA GCTCAAATAC 960 

TGGAAATACT GGCACTCTCG GCAGCATACG GCGAAGCAGA GGGTCCTTGA CATTGCCGAT 1020 

TACAAGGAGA GCTTTAATAC GATTGGAAAC ATTGAAGAGA TTGCATATAA TGCTGTTTCC 1080 

TTTACCTGGG ACGTGAATGA AGAGGOGAAG ATTTTCATCA CCGTGAATTG CTTGAGCACA 1 140 

GATTTCTCCT OCCAAAAAGG GGTCAAAGGA CTTOCTTTGA TG ATTCAGAT TGACACATAC 1200 

AGTTATAACA ATCGTAGCAA TAAACC3CATT CATAGAGCTT ATTGCCAGAT CAAGGTCTTC 1260 

TGTGACAAAG GAGCAGAAAG AAAAATCCGA GATGAAGAGC AGAAGCAGAA CAGGAAGAAC 1320 

GGGAAAGGCC AGGCCTCOCA AACTCAATGC AACAGC7CCT CTGATGGGAA GTTGGCTGOC 1380 

ATACCTTTAC AGAAGAAGAG TGACATCACC TACITCAAAA CCATGCCTGA TCTCCACTCA 1440 

CAGOCAGTTC TCTTCATACC TGATGTTCAC TXTGCAAACC TGCAGAGGAC CGGACAGGTG 1500 

TATTACAACA CGGATGATGA ACGAGAAGGT GGCAGTGTCC TTGTTAAACG GATGTTCCGG 1560 

CCCATGGAAG AGGAGTTTOG TCCGGTGCCT TCAAAGCAGA TGAAAGAAGA AGGGACAAAG 1620 

CGAGTGCTCT TGTACGTGAG GAAGGAGACT GACGATGTGT TCGATGCATT GATGTTGAAG 1680 

TCTCCCACAG TGATGGGCCT GATGGAAGCG ATATCTGAGA AATATGGGCT GCCCGTGGAG 1740 

AAGATAGCAA AGCTTTACAA G AAAAGCAAA AAAGGCATCT TGGTGAACAT GGATGACAAC 1800 

ATCATCGAGC ACTACTCGAA CGAGGACAOC TTCATCCTCA ACATCGAGAG CATGGTGGAG 1860 

GGCTTCAAGG TCACGCTCAT GGAAATCIAQ CCCTGGGTTr GGCATCCGCT TTGGCTGGAG 1920 

CTCTCAGTGC GTTCCTCCCT GAGAGAGACA GAAGCCCCAG COCCAGAACC TGGAGACCCA 1980 

TCTCOCGCAT CTCACAACTG CTGTTACAAG AGCGTGCTGG GGAGTGGGGC AAGGGACAGG 2040 

CCCCACAGTC GGTGTGCTTG GCCCATCCAC TGGCACCTAC CACGGAGCCG AAGCCTGAGC 2100 

CCCTCAGGAA GGTGCCTTAG GCCTGTTGGA TTCCTATTTA TTGCCCACCTTTrcCTGGAG 2160 

CCCAGGTCCA GGCCCGCCAG GACTCTGCAG GTCACTGCTA GCTOCAGATG AGACCGTCCA 2220 

GCGTTCCCCC TTCAAGAGAA ACACTCATCC CGAACAGCCT AAAAAATTCC CATCCC1 1C T 2280 

TTCTCACCCC TCCATATCTA TATCTOCCGA GTGGCTCG AC AAAATGAGCT AOGTCTGGGT 2340 

GCAGTAGTTA TAGGTGGGGC AAGAGGTGGA TGCCCACTTT CTGGTCAGAC ACCTTTAGGT 2400 

TGCTCTGGGG AAGGCTGTCT TGCTAAATAC CTCCAGGGTT COCAGCAAGT GGOCAOCAGG 2460 

CCTTGTACAG GAAGACATTC AGTCACCGTG TAATTAGTAA CACAGAAAGT CTGOCTGTCT 2520 

GCATTGTACA TAGTGTTTAT AATATTGTAA TAATATATTT TACCTGTGGT ATGTGGGCAT 2580 

GTTTACTGCC ACTGGCCTAG AGGAGACACA GACCTGGAGA COGTTTTAAT GGGGGTTTTT 2640 

GCCTCTGTGC CTGTTCAAGA GACTTGCAGG GCTAGGTAGA GGGOCTTTGG GATGTTAAGG 2700 

TGACTGCAGC 1X3 ATGOCAAG ATGGACTCTG CAATGGGCAT ACCTGGGGGC TDGTTCCCTG 2760 

TCCCCAG AGG AAGCCOCCTC TCCITCTCCA TGGGCATGAC TCTCCTTCGA GGCCAOCAOG 2820 

TTTATCTCAC AATGATGTGT I I 1GCCTGAC TTTCCCTTTG CGCTGTCTCG TGGGAAAGGT 2880 

CATTCTGTCT GAGACCCCAG C1CC1 IC1CC AGCTTTCGCT GCGGGCATGG CCTGAGCTTT 2940 

CTGGAGAGCC TCTGCAGGGG GTTTGCCATC AGGGCCCTGT GGCTGGGTCT GCTGCAGAGC 3000 

TCCTTGGCTA TCAGGAGAAT CCTGGACACT GTACTGTGCC TCCCAGTTTA CAAACACGCC 3060 

CTTCATCTCA AGTGGCCCTT TAAAAGGCCT GCTGCCATGT GAGAGCTGTG AACAGCTCAG 3120 

CTCTGAGTOG GCAGACTGGG GCTTCCTCCT GGGCCACCAG ATGGAAAGGG GGTATTGTTT 3180 

GCCTCACTCC TGGATGCTGC GTTTTAAGGA AGTGAGTGAG AAAGAATGTG CCAAGATACC 3240 

TGGCTCCTGT GAAAOCAGCC TCAGGAGGGA AACTGGGAGA GAGAAGCTGT GGTCTCCTGC 3300 

TACATGCCCT GGGAGCTGGA AGAGAAAAAC ACTCCCCTAA ACAATCGCAA AATGATGAAC 3360 

CATCATGGGC CACTGTTCTC TTTGAGGGGA CAGGTTTAGG GGTTTGCGTT CGCCCTTGTG 3420 

GGCTGAAGCA CTAGCTTTTT GGTAGCTAGA CACATCCTGC ACCCAAAGGT TCTCTACAAA 3480 

GGCCCAGATT TGTTTGTAAA GCACTTTGAC TCTTACCTGG AGGCCCGCTC TCTAAGGGCT 3540 

TCCTGCGCTC CCACCTCATC TGTCCCTGAG ATGCAGAGCA GGATGGAGGG TCTGCTTCTA 3600 

GCTCAGCTGT TTCTCCTTGA GGTTGCGGAG GAATTGAATT GAATGGG ACA GAGGGCAGGT 3660 

GCTGTGGCCA AG AAGATCTC CGAGCAGCAG TGACGGGGCA OCTTGCTGTG TGTCCTCTGG 3720 

GCATGTTAAC CCTTCTGTGG GGCCAAAGGT TTGCATCGTG GATGCAGCTG TGCTCCAGTC 3780 

TGTCCCCTCC TCCTCCACTC TGACTGCCAC GCCCCGGACC AGCAGCTTGG GGACCCTCCA 3840 

GGGTACTAAT GGGGCTCTGT TCTGAGATGG ACAAATTCAG TGTTGGAAAT ACATGTTGTA 3900 

CTATGCACTT CCCATGCTXC TAGGGTTAGG AATAGTTTCA AACATGATTG GCAGACATAA 3960 

CAACGGCAAA TACTCGOACT GGGGCATAGG ACTCCAGAGT AGGAAAAAGA CAAAAGATTT 4020 

GGCAGCCTGA CACAGGCAAC CTACCCCTCT CTCTCCAGCC TCTTTATGAA ACTGTTTGTT 4080 

TGOCAGTCCT GCCCTAAGGC AGAAGATGAA TTGAAGATGC TGTGCATGTT TCCTAAGTCC 4140 

TTGAGCAATC ATGGTGGTGA CAATTGCCAC AAGGGATATG AGGCCAGTGC CACCAGAGGG 4200 
TGGTGCCAAG TGCCACATCC CTTCCGATCC ATTOCCCTCT GTATCCTCOG AGCACCCCAG 4260 
TTTGCCTTTG ATGTGTCCGC TGTGTATGTT AGCTGAACTT TGATG AGCAA AATTTCCTGA 4320 
GCGAAACACT CCAAAGAG AT AGG AAAACTT GCCGCCTCTT CTTTTTTGTC CCTTAATCAA 4380 
ACTCAAATAA GCTTAA AAAA AATCCATGG A AGATCATGGA CATGTGAAAT GAGCATTTTT 4440 
TTCTTTTCTTTn i 1 11 11 1 i ITTTI iA AC AAAGTCTGAA CTGAACAGAA CAAOACTTTT 4500 
TCCTCATACA TCTCCAAATT GTTTAAACTT ACTTTATGAG TGTTTGTTTA GAAGTTCGGA 4560 
CCAACAGAAA AATGCAGTCA GATGTCATCT TGGAATTGGT TTCTAAAAGA GTAAGGCATG 4620 
TCOCTGCCCA GAAACTTAGG AAGCATG AAA TAAATCAAAT GTTTATTTTC CTTCTTATTT 4680 
AAAATCATGC TAATGCAACA GAAATAGAGG GTTTGTGCCA AATGCTATGA ACGGCCCTTT 4740 
CTTAAAGACA AGCAAGGGAG ATTGATATAT GTACAATTTG CTCTCATGTT TTT 



SEQ ID NO:2 BCU4 Protein sequence: 
Protein Accession #: NP_079191.1 

1          11         21         31         41         51 
|          |          |          |          |          | 

MSQESDNNKR LVALVPMPSD PPFNTRRAYT SEDEAWKSYL ENPLTAATKA MMDNGDEDS 60 
AAALGLLYDY YKVPRDKRLL SVSKASDSQE DQEKRNCLGT SEAQSNLSGG ENRVQVLKTV 120 
PVNLSLNQDH LENSKREQYS 1SFPESSAII PVSGITWKA EDFTPVFMAPFVHYPRGDGE 180 
BQRWBFEQT QYDVPSLATH S AYLKDDQRS TPDSTYSESF KDAATEKFRS ASVGAEEYMY 240 
DQTSSGTFQY TLEATKSLRQ KQGBGPMTYL NKGQFYAITL SETGDNKCFR HPEKVRS W 300 
MWFSEDKNR DEQLKYWKYW HSRQHTAKQR VLDIADYKES FMTIGNIEEI AYNAVSFTWD 360 
VNEEAKmT VNCLSTDFSS QKGVKGLPLM IQIDTYSYNN RSNKPIHRAY CQKVFOJKG 420 
AERKRDEEQ KQNRKNGKGQ ASQTQCNSSS DGKLAAIPLQ KKSDITYFKT MPDLHSQPVL 480 
FIPDVHFANL QRTGQVYYNT DDEREGGSVL VKRMFRPMEE EFGPVPSKQM KEEGTKRVIX 540 
YVRKETDDVF DALMLKSPTV MGLMEA1SEK YGLPVEKIAK LYKKSKKGIL VNMDDNHEH 600 
YSNEDTHLN MESMVEGFKV TLMH 



SEQ ID NO:3 BCU7 DNA SEQUENCE VARIANT 1: 

Nucleic Acid Accession #: AA428082 

Coding sequence: 1-777 (entire sequence represents open reading frame) 

1          11         21         31         41         51 
|          |          |          |          |          | 

ATGATAGCAA TCTCTCCCGT CAGCAGTGCA CTCCTGTTCT CCCTTCTCTG TGAAGCAAGT 60 

ACCGTCGTCC TACTCAATTC CACTGACTCA TCCCCGCCAA CCAATAATTT CACTGATATT 120 

6AAGCAGCTC TGAAAGCACA ATTAGATTCA GCGGATATCC CCAAAGCCAG GCGGAAGCGC 180 

TACATT TCGC AGAATGACAT GATCGCCATT CTTGATTATC ATAATCAAGT TCGGGGCAAA 240 

GTGTTCCCAC CGGCAGCAAA TATGGAATAT ATGGTTTGGG ATGAAAATCT TGCAAAATCG 300 

GCAGAGGCTT GGGCGGCTAC TTGCATTTGG GACCATGGAC CTTCTTACTT ACTGAGATTT 360 

TTGGGCCAAA ATCTATCTGT ACGCACTGGA AGATATCGCT CTATTCTCCA GTTGGTCAAG 420 

CCATGGTATG ATGAAGTGAA AGATTATGCT TTTCCATATC CCCAGGATTG CAACCCCAGA 480 

TGTCCTATGA GATGTTTTGG TCCCATGTGC ACACATTATA CGCAGATGGT TTGGGCCACT 540 

TCCAATCGGA TAGGATGCGC AATTCATGCT TGCCAAAACA TGAATGTTTG GGGATCTGTG 600 

TGGCGACGTG CAGTTTACTT GGTATGCAAC TATGCCCCAA AGGGCAATTG GATTGGAGAA 660 

GCACCATATA AAGTAGGGGT ACCATGTTCA TCTTGTCCTC CAAGTTATGG GGGATCTTGT 720 
ACTGACAATC TGTGTTTTCC AGGAGTTACG TCAAACTACC TGTACTGGTT TAAATAA 

SEQ ID NO:4 BCU7 DNA SEQUENCE VARIANT 2: 

Nucleic Acid Accession #: AA42S062 

Coding sequence: 1-777 (entire sequence represents open reading frame) 

1          11         21         31         41         51 

|          |          |          |          |          | 

ATGATAGCAA TCTCTCCCGT CAGCAGTGCA CTCCTGTTCT CCCTTCTCTG TGAAGCAAGT 60 

ACCGTCGTCC TACTCAATTC CACTGACTCA TCCCCGCCAA CCAATAATTT CACTGATATT 120 

GAAGCAGCTC TGAAAGCACA ATTAGATTCA GCGGATATCC CCAAAGCCAG GCGGAAGCGC 180 

TACATTTCGC AGAATGACAT GATCGCCATT CTTGATTATC ATAATCAAGT TCGGGGCAAA 240 

GTGTTCCCAC CGGCAGCAAA TATGGAATAT ATGGTTTGGG ATGAAAATCT TGCAAAATCG 300 

GCAGAGGCTT GGGCGGCTAC TTGCATTTGG GACCATGGAC CTTCTTACTT ACTGAGATTT 360 

TTGGGCCAAA ATCTATCTGT ACGCACTGGA AGATATCGCT CTATTCTCCA GTTGGTCAAG 420 

CCATGGTATG ATGAAGTGAA AGATTATGCT TTTCCATATC CCCAGGATTG CAACCCCAGA 480 

TGTCCTATGA GATGTTTTGG TCCCATGTGC ACACATTATA CGCAGATGGT TTGGGCCACT 540 

TCCAATCGGA TAGGATGCGC AATTCATACT TGCCAAAACA TGAATGTTTG GGGATCTGTG 600 

TGGCGACGTG CAGTTTACTT GGTATGCAAC TATGCCCCAA AGGGCAATTG GATTGGAGAA 660 

GCACCATATA AAGTAGGGGT ACCATGTTCA TCTTGTCCTC CAAGTTATGG GGGATCTTGT 720 
ACTGACAATC T GT GTT T TCC AGGAGTTACG TCAAACTACC TGTACTGGTT TAAATAA 

SEQ ID NO:5 BCU7 Protein sequence Variant 1: 

Protein Accession #: none 

1          11         21         31         41         51 

|          |          |          |          |          | 

MXAISAVSSA LLFSLLCEAS TWLLNSTDS SPPTNNFTDI EAALKAQLDS ADIPKARRKR 60 
YISQNDMIAI LDYHNQVRGK VFPPAANMEY MVWDENLAKS AEAWAATCIW DHGPSYLLRF 120 

LGQNLSVRTG RYRSILQLVK PWYDEVKDYA FPYPQDCNPR CPHRCFGPMC THYTQMVWAT 180 

SNRIGCAIHA CQNMNVWGSV WRRAVYLVCN YAPKGNWIGE APYKVGVPCS SCPPSYGGSC 240 
TDNLCFPGVT SNYLYWFK 

SEQ ID NO:6 BCU7 Protein sequence Variant 2: 
Protein Accession #: none 

1          11         21         31         41         51 

|          |          |          |          |          | 

MIAISAVSSA LLFSLLCEAS TWLLNSTDS SPPTNNFTDI EAALKAQLDS ADIPKARRKR 60 
YISQNDMIAI LDYHNQVRGK VFPPAANMEY MVWDENLAKS AEAWAATCIW DHGPSYLLRF 120 
LGQNLSVRTG RYRSILQLVK PWYDEVKDYA FPYPQDCNPR CPMRCFCPMC THYTQMVWAT 180 
SNRIGCAIHT CQNMNVWGSV WRRAVYLVCN YAPKGNWIGE APYKVGVPCS SCPPSYGGSC 240 
TDNLCFPGVT SNYLYWFK 

SEQ ID NO:7 BCX2 DNA SEQUENCE 

Nucleic Acid Accession #: NM_003014 

Coding sequence: 238-1278 (underlined sequences correspond to start and stop codons) 

1          11         21         31         41         51 
|          |          |          |          |          | 

GGCGGGTTOG CGCCCCG AAG GCTGAGACCT GGCGCTGCTC GTGCOCTQTG TGCCAGACXK3 60 
CGGAGCTCCG CGGCCGGACC CCGCGGCCCC GCTTTGCTGC CGACTGGAGT TTGGGGGAAG 120 
AAACTCTCCT GCGCCCCAGA AGATTTCTTC CTCGGCGAAG GGACAGCGAA AGATGAGGGT 180 
GGCAGGAAGA GAAGGCGCTT TCTGTCTGCC GGGGTCGCAG CGCGAGAGGG CAGTGCCATG 240 
TTCCTClCCA TCCTAGTGGC GCTGTGOCTG TGGCTGCAOC TGGGGCTGGG CGTGOGCGGC 300 
GCGCCCTGCG AGGCGGTGCG CATCCCTATG TGCCGGCACA TGCCCTGG AA CATCACGCGG 360 
ATGCOCAACC ACCTGCACCA CAGCACGCAG GAGAACGCCA TCCTGGCCAT CGAGCAGTAC 420 
GAGGAGCTGG TGGACGTGAA CTGCAGCGCC GTGCTGCGCT I CI IC T T C IG TGCCATGTAC 480 
GCGOCCATTT GCACCCTGGA GTTCCTGCAC GACCCTATCA AGCCGTGCAA GTCGCTGTGC 540 
CAACGCGCGC GCGACGACTG CGAGCCCCTC ATGAAGATGT ACAACCACAG CTGGCCCGAA 600 
AGOCTGGCCT GCGACGAGCT GCOGTCTAT GAGCGTGGCG TGTGCATTTC GCCTGAAGCC 660 
ATCGTCACGG AGCTGCOGGA GGATGTTAAG TGGATAG ACA TCACACCAGA CATGATGGTA 720 
CAGGAAAGGC CTCTTGATGT TG ACTGTAAA CGCCTAAGCC CCG ATCGGTG CAAGTGTAAA 780 
AAGGTGAAGC CAACTTTGGC AACGTATCTC AGCAAAAACT ACAGCTATGT TATTCATGCC 840 
AAAATAAAAG CTGTGCAGAG GAGTGGCTGC AATGAGGTCA CAACGGTGGT GGATGTAAAA 900 
GAGATCTTCA AGTCCTCATC ACCCATCCCT CGAACTCAAG TCCCGCTCAT TACAAATTCT 960 
TCTTGCCAGT GTOCACACAT CCTGCCCCAT CAAGATGTTC TCATCATGTG TTACGAGTGG 1020 
CGTTCAAGGA TGATGCTTCT TGAAAATTGC TTAGTTGAAA AATGGAGAGA TCAGCTTAGT 1080 
AAAAGATCCA TACAGTGGGA AGAGAGGCTG CAGGAACAGC GGAGAACAGT TCAGGACAAG 1140 
AAGAAAACAG CCGGGCGCAC CAGTCGTAGT AATCCCCCCA AACCAAAGGG AAAGCCTCCT 1200 
GCTGCCAAAC CAGCCAGTCC CAAGAAGAAC ATTAAAACTA GGAGTGCCCA GAAGAGAACA 1260 
AACCCGAAAA GAGTGTGAGC TAACTAGTTT CCAAAGCGGA GACTTCCGAC TTOCTTACAG 1320 
GATG AGGCTG GGCATTGCCT GGGACAGOCT ATGTAAGGCC ATGTGCOCCT TGCCCTAACA 1380 
ACTCACTGCA GTGCTCTTCA TAGACACATC TTGCAGCATT TTTCTTAAGG CTATGGTTCA 1440 
GTTTTTCTTT GTAAGCCATC ACAAGOCATA GTGGTAGGTT TGCCCTTTGG TACAGAAGGT 1500 
GAGTTAAAGC TGGTGG AAAA GGCTTATTGC ATTGCATTCA GAGTAACCTG TGTGCATACT 1560 
CTAGAAGAGT AGGGAAAATA ATGCTTGTTA CAATTCGACX TAATATGTGC ATTGTAAAAT 1620 
AAATGOCATA TTTCAAACAA AACACGTAAT TTTTTTACAG TATGTTTTAT TACCTTTTGA 1680 
TATCTGTTGT TGCAATGTTA GTGATGTTTT AAAATCTGAT GAAAATATAA TGTTTTTAAG 1740 
AAGGAACAGT AGTGGAATGA ATGTTAAAAG ATCTTTATGT GTTTATGGTC TGCAGAAGGA 1800 
TTTTTGTGAT GAAAGGGGAT TTTTTGAAAA ATTAGAGAAG TAGCATATGG AAAATTATAA 1860 
TOTOTTTm TAOCAATGAC TTCAGTTTCT GTTTTTAGCT AGAAACTTAA AAACAAAAAT 1920 
AATAAT AAAG AAAAATAAAT AAAAAGG AGA GGCAGACAAT GTCTGG ATIC CTGTTTTTTG 1980 
GTTACCTG AT TTOCATGATC ATGATGCTTC TTGTCAACAC CCTCTTAAGC AGCACCAGAA 2040 
ACAGTGAGTT TGTCI G TACC ATTAGGAGTT AGGTACTAAT TAGTTGGCTA ATGCTCAAGT 2100 
ATTTTATACC CACAAGAGAG GTATGTCACT CATCTTACTT CCCAGGACAT CCACCCTGAG 2160 
AATAATTTGA CAAGCTTAAA AATGGCCTTC ATGTGAGTGC CAAATTTTGT TTTTCTTCAT 2220 
TTAAATATTT TCTTTGCCTA AATACATGTG AGAGGAGTTA AATATAAATG TACAQAGAGG 2280 
AAAGTTGAGT TCCACCTCTG AAATG AG AAT TACTTG ACAG TTGGGATACT TTAATCAGAA 2340 
AAAA AGAA CT TATTTGCAGC ATTTTATCAA CAAATTTCAT AATTGTGGAC AATTGGAGGC 2400 
ATTTATTTTA AAAAACAATT TTATTGGCCT 1TTGCTAACA CAGTAAGCAT GTATTTTATA 2460 
AGGCATTCAA TAAATGCACA ACGCCCAAAG GAAATAAAAT CCTATCTAAT CCTACTCTCC 2520 
ACTACACAGA GGTAATCACT ATTAGTATTT TGGCATATTA TTCTCCAGGT GTTTGCTTAT 2580 
GCACTTATAA AATGATTTGA ACAAATAAAA CTAGGAACCT GTATACATGT GTTTCATAAC 2640 
CTGGCTOCTT TGCTTGGCCC TTTATTGAGA TAAGTTTTCC TGTCAAGAAA GCAGAAACCA 2700 
TCTCATTTCT AACAGCTGTG TTATATTCCA TAGTATGCAT TACTCAACAA ACTGTTGTGC 2760 
TATTGGATAC TTAGGTGGTT TCTTCACTGA CAATACTGAA TAAACATCTC ACCGGAATTC 



SEQ ID NO:8 BCX2 Protein sequence: 
Protein Accession #: NP_003005.1 

1          11         21         31         41         51 
|          |          |          |          |          | 

MFLSILVALC LWLHLALGVR GAPCEAVRIP MCRHMPWNIT RMPNHLHHST QENAILAIEQ 60 
YEELVDVNCS AVLRFFFCAM YAPICTLEFL HDPIKPCKSV CQRARDDCEP LMKMYNHSWP 120 
ESUCDELPV YDRGVCISPE AIVTDLPEDV KWIDITPDMM VQERPLDVDC KRLSPDRCKC 180 
KKVKPTLATY LSKNYSYVIH AKIKA VQRSG CNEVTTWDV KBIFKSSSP1 PRTQVPUTN 240 
SSCQCPHILP HQD VLD4CYE WRSRMMLLEN CLVEKWRDQL SKR5IQWEER LQEQRRTVQD 300 
KKKTAGRTSR 5NPPKPKGKP PAPKPASPKK NIKTRSAQKR TNPKRV 

SEQ ID NO:9 CBK1 DNA SEQUENCE 

Nucleic Acid Accession #: NM_032391 

Coding sequence: 129-302 (underlined sequences correspond to start and stop codons) 

1          11         21         31         41         51 

|          |          |          |          |          | 

GTCCTTCCTC TCCTAGCCTA AGGCGTGCAA ACAGAGCGCC ACTGGGAGGC TGAAACCTTT 60 

AGGCCGATGC TTGCTTGCAA GGTCAGGCAA GCTGGATTCT GGTCCCCACC TTTGCAGAGA 120 

GAACAGC GA'T GTTGTGCGCC CATTTCTCAG ATCAAGGACC GGCCCATCTT ACTAOCTCCA 180 

AGAGTGCTTT TCTCTCTAAT AAGAAAACAT CTACTTTGAA ACATCTACT6 GGCGAGACCA 240 

GGAGTGATGG CTCAGCCTGT AATTCTGGAA TTTCGGGAGG CCGA6GCAGG AAGATTCCTT 300 

GAGCACAGGA GTTCCAGACC AGCCTGGGCA ATGTAGCAAG ACGCTQTCTC TATTTATACA 360 
ATAAAATTTT TTTAAAAAAG G 



SEQ ID NO:10 CBK1 Protein sequence: 
Protein Accession #: NP_115767 

1          11         21         31         41         51 

|          |          |          |          |          | 

MLCAHFSDQG PAHLTTSKSA FLSNKKTSTL KHLLGETRSD GSACNSGISG GRGRfQP 
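
The "Coding sequence" ranges in this listing can be checked against the lengths of the corresponding protein sequences. The sketch below is illustrative only: it assumes the printed range is 1-based, inclusive, and includes the stop codon, which is consistent with the CBK1 entry above (129-302, 57 residues printed) and with the LBH9 entry (26-457, 143 residues printed). The function names `cds_summary` and `extract_cds` are hypothetical.

```python
def cds_summary(start, end):
    """Return (nucleotides, codons, residues) for a coding range.

    Assumes 1-based inclusive coordinates with the stop codon included,
    so the number of residues is one less than the number of codons.
    """
    nucleotides = end - start + 1
    if nucleotides % 3 != 0:
        raise ValueError("coding range is not a whole number of codons")
    codons = nucleotides // 3
    return nucleotides, codons, codons - 1


def extract_cds(listing_text, start, end):
    """Slice the coding region out of a sequence printed in 10-base blocks.

    Whitespace between blocks is removed first; line-end position
    numbers must already have been stripped by the caller.
    """
    cleaned = "".join(listing_text.split()).upper()
    return cleaned[start - 1:end]


# CBK1: coding sequence 129-302
print(cds_summary(129, 302))
# LBH9: coding sequence 26-457
print(cds_summary(26, 457))
# Toy input (not from the listing): start codon at 3-5, stop codon at 9-11
print(extract_cds("AAATG GCC TAA TT", 3, 11))
```

A mismatch between the residue count computed this way and the printed protein length is a quick signal that either the coding range or the sequence text was damaged in transcription.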

SEQ ID NO:11 CHA1 DNA SEQUENCE 

Nucleic Acid Accession #: NM_020182 

Coding sequence: 96-854 (underlined sequences correspond to start and stop codons) 



1          11         21         31         41         51 

|          |          |          |          |          | 

TCCTTGGGTT CGGGTGAAAG C6CCTGGGGG TTCGTGGCCA TGATCCCCGA GCTGCTGGAG 60 
AACTGAAGGC GGACAGTCTC CTGCGAAACC AGGCAATGGC GGAGCTGGAG TTTGTTCAGA 120 
TCATCATCAT CGTGGTGGTG ATGATGGTGA TGGTGGTGGT GATCACGTGC CTGCTGAGCC 180 
ACTACAAGCT GTCTGCACGO TCCTTCATCA GCCGGCACAG CCAGGGGCGG AGGAGAGAAG 240 
ATGCCCTGTC CTCAOAAGGA TGCCTGTGGC CCTCGGAGAG CACAGTGTCA GGCAACGGAA 300 
TCCCAGAGCC GCAGGTCTAC GCCCCGCCTC GGCCCACCGA CCGCCTGGCC GTGCCGCCCT 360 
TCGCCCAGCG GGAGCGCTTC CACCGCTTCC AGCCCACCTA TCCGTACCTG CAGCACGAGA 420 
TCGACCTGCC ACCCACCATC TCGCTGTCAG ACGGGGAGGA GCCCCCACCC TACCAGGGCC 480 
CCTGCACCCT CCAGCTTCGG GACCCC6AGC AGCAGCTGGA ACTGAACCGG GAGTCGGTGC 540 
GC6CA0CCCC AAACAGAACC ATCTTCGACA GTGACCTGAT GGATAGTGCC AGGCTGGGCG 600 
GCCOCT6CCC CCCCAGCAGT AACTCGGGCA            GTGCTACGGC            660 
GCATGGAGGG 6CCG0C6CCC ACCTACAGC6 AGGTCATCGG CCACTACCCG GGGTCCTCCT 720 
TCCAGCACCA GCAGA6CAGT GGGCCGCCCT CCTTGCTGGA GGGGACCCGG CTCCACCACA 780 
CACACATCGC 6COCCTAGA6 A6CGCAGCCA TCTGGAGCAA AGAGAAGGAT AAACAGAAAG 840 
GACACCCTCT CTAGGGTCCC CAG6GG66CC            TGCGTAGGTG AAAAGGCAGA 900 
ACACTCCGCG CTTCTTAGAA GAGGAGTGAG AGGAAGGCGG GGGGCGCAGC AAOGCATCGT 960 
GTGGCCCTCC OCTOCCACCT CCCTGTGTAT AAATATTTAC ATGTGATGTC TGGTCTGAAT 1020 
GCACAAGCTA AGAGAGCTTG CAAAAAAAAA AAGAAAAAAG AAAAAAAAAA ACCACGTTTC 1080 
TTTGTTGAGC TGTGTCTTGA AGGCAAAAGA AAAAAAATTT CTACAGTAAA 1140 



SEQ ID NO:12 CHA1 Protein sequence: 
Protein Accession #: NP_064567 

1          11         21         31         41         51 

|          |          |          |          |          | 

KAELEFVQII IIWVMMVMV WITCLLSHY KLSARSP1SR HSQGRRREDA LSSBGCLWPS 60 

ESTVSGNGIP EPQVYAPPRP TDRLAVPPFA QRERFHRFQP TYPYLQHEID LPPTISLSDG 120 

EEPPPYQGPC TLQLRDPEQQ LELNRESVRA PPKRTXFDSD LMDSARLGGP CPPSSNSGIS 180 

ATCYGSGGRM EGPPPTYSEV IGHYPGSSFQ HQQSSGPPSL LEGTRLHHTH IAPLESAAIW 240 
SKEKDKQKGH PL 



SEQ ID NO:13 CJA5 DNA SEQUENCE 

Nucleic Acid Accession #: NM_012445 

Coding sequence: 276-1271 (underlined sequences correspond to start and stop codons) 

1          11         21         31         41         51 

|          |          |          |          |          | 

GCACGAGGGA AGAGGGTGAT CCGACCCGGG GAAGGTCGCT 6GGCAGGGCO AGTTGGGAAA 60 

GCGGCAGCCC CCGCCGCCCC CGCAGCCCCT TCTCCTCCTT TCTCCCACGT OCTATCTGCC 120 

TCTCGCTGGA GGCCAGGCCG TGCAGCATCO AAGACAGGAG GAACTGGAGC CTCATTGGCC 180 

GGCCCGGGGC GCCGGCCTCO GGCTTAAATA GGAGCTCCGG GCTCTQGCTO GGACCCGACC 240 

GCTGCCGGCC GCGCTCCCGC TGCTCCT60C GGGTGATGGA AAACCCCAGC CCGGCCGCCG 300 

CCCTGGGCAA GGCCCTCTGC GCTCTCCTCC TGGCCACTCT CGGCGCCGCC GGCCAGCCTC 360 

TTGGGGGAGA GTCCATCTGT TCCGCCAGAG CCCCGGCCAA ATACAGCATC ACCTTCACGG 420 

GCAAGTGGAG CCAGACGGCC TTCCCCAAGC AGTACCCCCT GTTCCGCCCC CCTGCGCAGT 480 

GGTCTTCGCT GCTGGGGGCC GCGCATAGCT CCGACTACAG CATGTGGAGG AAGAACCAGT 540 

ACGTCAGTAA CGGGCTGCGC GACTTTGCGG AGCGCGGCGA GGCCTGGGCG CTGATGAAGG 600 

AGATCGAGGC GGCGGGGGAG GOGCTGCAGA GCGTGCACGC GGTGTTTTCG GCGCCOGCCG 660 

TCCCCAGCGG CACCGGGCAG ACGTCGGCGG AGCTGGAGGT GCAGCGCAGG CACTCGCTGG 720 

TCTCGTTTGT GGTGCGCATC GTGCCCAGCC CCGACTGGTT CGTGGGCGTG GACAGCCTGG 780 

ACCTGTGCGA CGGGGACCGT TGGCGGGAAC AGGCGGCGCT GGACCTGTAC CCCTACGACG 840 

CCGGGACGGA CAGCGGCTTC ACCTTCTCCT CCCCCAACTT CGCCACCATC CCGCAGGACA 900 

CGGTGACCGA GATAACGTCC TCCTCTCCCA GCCACCCGGC CAACTCCTTC TACTACCCGC 960 

GGCTGAAGGC CCTGCCTCCC ATCGCCAGGG TGACACTGGT GCGGCTGCGA CAQAGCCCCA 1020 

GGGCCTTCAT CCCTCCCGCC CCAGTCCTCC CCAGCAGGGA CAATGAGATT GTAGACAGCG 1080 

CCTCAGTTCC AGAAACGCCG CTGGACTGCG AGGTCTCCCT GTGGTCGTCC TCGGGACTGT 1140 

GCGGAGGCCA CTGTGGGAGG CTCGGGACCA AGAGCAGGAC TCGCTACGTC CGGGTCCAGC 1200 

CCGCCAACAA OGGGAGCCCC TGCCCCGAGC TCGAAGAAGA GGCTGAGTGC GTCCCTGATA 1260 

ACTGCGT CTA AG ACCAGAGC CCCGCAGCCC CTGGGGCCCC CGGAGCCATG GGGTGTCGGG 1320 

GGCTCCTGTG CAGGCTCATG CTGCAGGCGG CCGAGGCACA GGGGGTTTCG CGCTGCTCCT 1380 

GACCGCGGTG AGGCCGCGCC GACCATCTCT GCACTGAAGG GCCCTCTGGT GGCCGGCACG 1440 

GGCATTGGGA AACAGCCTCC TCCTTTCCCA ACCTTGCTTC TTAGGGGCCC CCGTGTCCCG 1500 

TCTGCTCTCA GCCTCCTCCT CCTGCAGGAT AAAGTCATCC CCAAGGCTCC AGCTACTCTA 1560 

AATTATGGTC TCCTTATAAG TTATTGCTGC TGCAGGAGAT TGTCCTTCAT CCTCCAGGGG 1620 

CCTGGCTCCC ACGTGGTTGC AGATACCTCA GACCTGGTGC TCTAGGCTGT GCTGAGCCCA 1680 

CTCTOCCGAG GGCGCATCCA AGCGGGGGCC ACTTGAGAAG TGAATAAATG GGGCGGTTTC 1740 

GGAAGCGTCA GTGTTTCCAT GTTATGGATC TCTCTGCGTT TGAATAAAGA CTATCTCTGT 1800 
TGCTCAC 



SEQ ID NO:14 CJA5 Protein sequence: 
Protein Accession #: NP_036577 

1          11         21         31         41         51 

|          |          |          |          |          | 

MENPSPAAAL GKALCALLLA TLGAAGQPLG GESICSARAP AKYSITFTGK WSQTAFFKQY 60 

PLFRPPAQWS SLLGAAHSSD YSMWRKNQYV SNGLRDFAER GEAWALMKEI EAAGEALQSV 120 

HAVFSAPAVP SGTGQTSAEL EVQRRHSLVS FWRTVPSPD WFVGVDSLDL CDGDRWREQA 180 

ALDLYPYBAG TDSGFTFSSP NFATIPQDTV TEITSSSPSH PANSFYYPRL KALPPIARVT 240 

LVRLRQSPRA FIPPAPVLPS RDNEIVDSAS VPETPLDCEV SLWSSWGLCG GHCGRLGTKS 300 
RTRYVRVQPA NNGSPCPELE EEAECVPDNC V 



SEQ ID NO:15 LBH9 DNA SEQUENCE 

Nucleic Acid Accession #: NM_002391 

Coding sequence: 26-457 (underlined sequences correspond to start and stop codons) 
1          11         21         31         41         51 

|          |          |          |          |          | 

CGGGCGAAGC AGCGCGGGCA GCGAGATGCA GCACCGAGGC TTCCTCCTCC TCACCCTCCT 60 

CGCCCTGCTG GCGCTCAOCT CCGCGGTCGC CAAAAAGAAA GATAAGGTGA AGAAGGGCGG 120 

CCCGGGGAGC GAGTGCGCTG AGTGGGCCTG GGGGCCCTGC ACCCCCAGCA GCAAGGATTG 180 

CGGCGTGGGT TTCCGCGAGG GCACCTGCGG GGCCCAGACC CAGCGCATCC GGTGCAGGGT 240 

GCCCTGCAAC TGGAAGAAGG AGTTTGGAGC CGACTGCAAG TACAAGTTTG AGAACTGGGG 300 

TGCGTGTGAT GGGGGCACAG GCACCAAAGT CCGCCAAGGC ACCCTGAAGA AGGCGCGCTA 360 

CAATGCTCAG TGCCAGGAGA CCATCCGCGT CACCAAGCCC TGCACCCCCA AGACCAAAGC 420 

AAAGGCCAAA GCCAAGAAAG GGAAGGGAAA GGACTAGACG CCAAGCCTGG ATGCCAAGGA 480 

GCCCCTGGTG TCACATGGGG CCTGGCCAOG CCCTCCCTCT CCCAGGCCCG AGATGTGACC 540 

CACCAGTGCC TTCTGTCTGC TCGTTAGCTT TAATCAATCA TGCCCTGCCT TGTCCCTCTC 600 

ACTCCOCAGC CCCACCCCTA AGTGCCCAAA GTGGGGAGGG ACAAGGGATT CTGGGAAGCT 660 

TGAGCCTCCC CCAAAGCAAT GTGAGTCCCA GAGCCCGCTT TTGTTCTTCC CCACAATTCC 720 

ATTACTAAGA AACACATCAA ATAAACTGAC TTTTTCCCCC CAATAAAAGC TCTTCTTTTT 780 
TAATAT 



SEQ ID NO:16 LBH9 Protein sequence: 
Protein Accession #: NP_002382 

1          11         21         31         41         51 

|          |          |          |          |          | 

MQHRGFLLLT LLALLALTSA VAKKKDKVKK GGPGSECAEW AWGPCTPSSK DCGVGFREGT 60 
CGAQTQRIRC RVPCNWKKEF GADCKYKPEN WGACDGGTGT KVRQGTLKKA RYNAQCQETI 120 
RVTKPCTPKT KAKAKAKKGK GKD 
SEQ ID NO:17 LEM9 DNA SEQUENCE 

Nucleic Acid Accession #: NM_005244 

Coding sequence: 1-1617 (underlined sequences correspond to start and stop codons) 

1          11         21         31         41         51 

|          |          |          |          |          | 

ATGQTAGAftC TAGTGATCTC ACCCAGCCTC ACTGTAAACA GCGATTGTCT GGATAAACTG 60 

AAGTTTAACC GTGCT6ACGC TGCTGTGTGG ACTCTGAQTG ACAGACAAGG CATCACCAAA 120 

TCGGCCCCCC TCAGAGTGTC CCAGCTCTTC TCCAGATCTT GCCCACGTGT CCTCCCCCGC 180 

CAGCCTTCCA CAOCCATGGC AGCCTACGGC CAGACGCAGT ACAGTGCGGG GATCCAGCAG 240 

GCTACCCOCT ATACAGCTTA CCCACCTCCA GCACAAGCCT ATGGAATCCC TTCCTACAGC 300 

ATCAAGACAG AAGACAGCTT GAACCATTCC CCTGGCCAGA GTGGATTCCT CAGCTATGGC 360 

TCCAGCTTCA CCACCTCACC CACTGGACAG AGCCCATACA CCTACCAGAT GCACGGCACA 420 

ACAGGGTTCT ATCAAGGAGG AAATGGACTG GGCAACGCAG CCGGTTTCGG GAGTGTGCAC 480 

CAGGACTATC CTTCCTACOC CGGCTTCCCC CAGAGCCAGT ACCCCCAGTA TTACGGCTCA 540 

TCCTACAACC CTCOCTACGT OOCGGCCAGC AGCATCTGCC CTTCGCCCCT CTCCACGTCC 600 

ACCTACGTCC TCCAGGAGGC ATCTCACAAC GTCCCCAACC AGAGTTCCGA GTCACTTGCT 660 

GGTGAATACA ACACACACAA TGGACCTTCC ACACCAGCGA AAGAGGGAGA CACAGACAGG 720 

CCGCACCGGG CCTCCGACGG GAAGCTCCGA GGCCGGTCTA AGAGGAGCAG TGACCCGTCC 780 

CCGGCAGGGG ACAATGAGAT TGAGCGTGTG TTCGTGTGGG ACTTGGATGA GACAATAATT 840 

ATTTTTCACT CCTTACTCAC GGGGACATTT GCATCCAGAT ACGGGAAGGA CAOCACGACG 900 

TCCGTGCGCA TTGGCCTTAT GATGGAAGAG ATGATCTTCA ACCTTGCAGA TACACATCTG 960 

TTCTTCAATG ACCTGGAGGA TTGTGACCAG ATCCACGTTG ATGACGTCTC ATCAGATGAC 1020 

AATGGCCAAG ATTTAAGCAC ATACAACTTC TCCGCTGACG GCTTCCACAG TTCGGCCCCA 1080 

GGAGCCAACC TGTGCCTGGG CTCTGGCGTG CACGGCGGCG TGGACTGGAT GAGGAAGCTG 1140 

GCCTTCCGCT ACCGGCGGGT GAAGGAGATG TACAATACCT ACAAGAACAA CGTTGGTGGG 1200 

TTGATAGGCA CTCCCAAAAG GGAGACCTGG CTACAGCTCC GAGCTGAGCT GGAAGCTCTC 1260 

ACAGACCTCT GGCTGACCCA CTCCCTGAAG GCACTAAACC TCATCAACTC CCGGCCCAAC 1320 

TGTGTCAATG TGCTGGTCAC CACCACTCAA CTAATTCCTG CCCTGGCCAA AGTCCTGCTA 1380 

TATGGCCTGG GGTCTGTGTT TCCTATTGAG AACATCTACA GTGCAACCAA GACAGGGAAG 1440 

GAGAGCTGCT TCGAGAGGAT AATGCAGAGA TTCGGCAGAA AAGCTGTCTA CGTGGTGATC 1500 

GGTGATGGTG TGGAAGAGGA GCAAGGAGCG AAAAAGCACA ACATGCCTTT CTGGCGGATA 1560 
TCCTGCCACG CAGACCTGGA GGCACTGAGG CACGCCCTGG AACTGQAGTA TTTATA^ 

SEQ ID NO:18 LEM9 Protein sequence: 
Protein Accession #: NP_005235 

1          11         21         31         41         51 

|          |          |          |          |          | 

MVELVISPSL TVNSDCLDKL KFNRADAAVW TLSDRQGITK SAPLRVSQLF SRSCPRVLFR 60 

QPSTAMAAYG QTQYSAGIQQ ATPYTAYPPP AQAYGIPSYS HCTEDSLNHS PGQSGFLSYG 120 

SSFSTSPTGQ SPYTYQHHGT TGPYQGGNGL GNAAGFGSVH QEYPSYPGFP QSQYPQYYGS 180 

SYNPPYVPAS SICPSPLSTS TYVLQEASHN VPNQSSESLA GBVITIHNGPS TPAKEGDTDR 240 

PHRASDGKLR GRSKRSSDPS PAGDNEIERV FVWDLDETII IFHSLLTGTF ASRYGKDTTT 300 

SVRIGLMMEE MIFNLADTHL FFNDLEDCDQ IHVDDVSSDD NGQDLSTYNF SADGFHSSAP 360 

GANLCLGSGV HGGVDWMRKL AFRYRKVKEM YNTYKNNVGG LIGTPKHETW LQLRAELEAL 420 

TDLWLTHSLK ALNLINSRFN CVNVLVTTTQ LIPALAKVLL YGLGSVFPIE NIYSATKTGK 480 
ESCFERIMQR FGRKAVYWT GDGVEEEQGA KKHNMPFWRI SCHADLEALR HALELEYL 



SEQ ID NO:19 OAA1 DNA SEQUENCE 

Nucleic Acid Accession #: NM_002740 

Coding sequence: 178-1968 (underlined sequences correspond to start and stop codons) 

1          11         21         31         41         51 

|          |          |          |          |          | 

CCGCGGTTCC GGCTGCTCOG GCGAGGCGAC CCTTGGGTCG GCGCTGCGGG CGAGGTGGGC 60 

AGGTAGGTGG GCGGACGGCC GCGGTTCTCC GGCAAGCGCA GGCGGOGGAG TCCCCCACGG 120 

CGCCCGAAGC GCCCCCCGCA CCCCCGGCCT CCAGCGTTGA GGCGGGGGAG TGAGGAGATG 180 

CCGACCCAGA GGGACAGCAG CACCATGTCC CACACGGTCG CAGGCGGCGG CAGCGGGGAC 240 

CATTCCCACC AGGTCCGGGT GAAAGCCTAC TACCGCGGGG ATATCATGAT AACACATTTT 300 

GAACCTTCCA TCTCCTTTGA GGGCCTTTGC AATGAGGTTC GAGACATGTG TTCTTTTGAC 360 

AACGAACAGC TCTTCACCAT GAAATGGATA GATGAGGAAG GAGACCCGTG TACAGTATCA 420 

TCTCAGTTGG AGTTAGAAGA AGCCTTTAGA CTTTATGAGC TAAACAAGGA TTCTGAACTC 480 

TTGATTCATG TGTTCCCTTG TGTACCAGAA CGTCCTGGGA TGCCTTGTCC AGGAGAAGAT 540 

AAATOCATCT ACCGTAGAGG TGCACGCCGC TGGAGAAAGC TTTATTGTGC CAATGGCCAC 600 

ACTTTCCAAG CCAAGCGTTT CAACAGGCGT GCTCACTGTG CCATCTGCAC AGACCGAATA 660 

TGGGGACTTG GACGCCAAGG ATATAAGTGC ATCAACTGCA AACTCTTGGT TCATAAGAAG 720 

TGCCATAAAC TCGTCACAAT TGAATGTGGG CGGCATTCTT TGCCACAGGA ACCAGTGATG 780 

CCCATGGATC AGTCATCCAT GCATTCTGAC CATGCACAGA CAGTAATTCC ATATAATCCT 840 

TCAAGTCATG AGAGTTTGGA TCAAGTTGGT GAAGAAAAAG AGGCAATGAA CACCAGGGAA 900 

AGTGGCAAAG CTTCATCCAG TCTAGGTCTT CAGGATTTTG ATTTGCTGCG GGTAATAGGA 960 

AGAGGAAGTT ATGCCAAAGT ACTGTTGGTT CGATTAAAAA AAACAGATCG TATTTATGCA 1020 

ATGAAAGTTG TGAAAAAAGA GCTTGTTAAT GATGATGAGG ATATTGATTG GGTACAGACA 1080 

GAGAAGCATG TGTTTGAGCA GGCATCCAAT CATCCTTTCC TTGTTGGGCT GCATTCTTGC 1140 

TTTCAGACAG AAAGCAGATT gWriTOTT ATAGAGTATG TAAATGGAGG AGACCTAATG 1200 

TTTCATATGC AGCGACAAAG AAAACTTCCT GAAGAACATG CCAGATTTTA CTCTGCAGAA 1260 
ATCAGTCTAS CATTAAATTA TCTTCATGAG CGACGGATAA TTTATAGAGA TTTGAAACTG 1320 
GACAATGTAT TACTGGACTC TGAAGGCCAC ATTAAACTCA CTGACTACGG CATGTGTAAG 1380 
GAAGGATTAC GCCCAGGAGA TACAACCAGC ACTTTCTGTG GTACTCCTAA TTACATTGCT 1440 
CCTGAAATTT TAAGAGGAGA AGATTATGGT TTCAGTGTTG ACTGGTGGGC TCTTGGAGTQ 1500 
CTCATGTTTG AGATGATGGC AGGAAGGTCT CCATTTGATA TTGTTGGGAG CTCOGATAAC 1560 
CCTGACCAGA ACACAGAGGA TTATCTCTTC CAAGTTATTT TGGAAAAACA AATTCGCATA 1620 
CCACGTTCTC TGTCTGTAAA AGCTGCAAGT GTTCTGAAGA GTTTTCTTAA TAAGGACCCT 1680 
AAGOAACGAT TGGGTTGTCA TCCTCAAACA GGATTTGCTG ATATTCAGGG ACACCCGTTC 1740 
TTCCGAAATG TTGATTGGGA TATGATGGAG CAAAAACAGG TGGTACCTCC CTTTAAACCA 1800 
AATATTTCTG GGGAATTTGG TTTGGACAAC TTTGATTCTC AGTTTACTAA TGAACCT6TC 1860 
CAGCTCACTC CAGATGACGA TGACATTGTG AGGAAGATTG ATCAGTCTGA ATTTGAAGGT 1920 
TTTGAGTATA TCAATCCTCT TTTGATGTCT 6CA6AAGAAT GTGTCTGATC CTCATTTTTC 1980 
AACCATGTAT TCTACTCATG TTGCCATTTA ATGCATG6AT AAACTTGCTG CAAGCCTGGA 2040 
TACAATTAAC CATTTTATAT TTGCCACCTA CAAAAAAACA CCCAATATCT TCTCTTGTAG 2100 
ACTATATGAA TCAATTATTA CATCTGTTTT ACTATGAAAA AAAAATTAAT ACTACTAGCT 2160 
TCCAGACAAT CATGTCAAAA TTTAGTTGAA CTGGTTTTTC AGTTTTTAAA AGGCCTACAG 2220 
ATGAGTAATG AAGTTACCTT TTTTGTTTAA AAAAAAAAAA O 



SEQ ID NO:20 OAA1 Protein sequence: 
Protein Accession #: NP_002731 

1          11         21         31         41         51 

|          |          |          |          |          | 

MSHTVAGGGS GDHSHQVRVK AYYRGDIMIT HFEPSISFEC LCNEVRDMCS FDNEQLFTMK 60 
WIDEEGDPCT VSSQLELEEA FRLYELNKDS ELLIHVFPCV PERPGMPCPG EDKSIYRRGA 120 
RRWRKLYCAN GHTFQAKRFN RRAHCAICTD RIWGLGRQGY KCINCKLLVH KKCHKLVTIE 180 
CGRHSLPQEP VMFMDQSSMH SDHAQTVTPY NPSSHESLDQ VGEEKEAMNT RBSGXASSSL 240 
GLQDFDLLRV IGRGSYAKVL LVRLKKTDRI YAMKWKKEL WDDEDH3W QTEKHVFEQA 300 
SNHPFLVGLH SCPQTESRLF FVIEYVNGGD LMFHMQRQRK LPEEHARFYS AEISLALNYL 360 
HBRGZIYRDL KLDNVLLDSE GHIKLTDYGM CKBGLRPGDT TSTFCGTFNY IAPEILRGED 420 
YGFSVDWWAL GVLMFEMMAG RSPFD1VGSS DKPDQNTEEY LFQVILEKQI R1PRSLSVKA 480 
ASVLKSPLNK DPKERLGCHP QTGFADIQGH PPFRNVDWDH MEQKQWPPF KPNISGEFGL 540 
DKFDSQFTNE PVQLTPDDDD IVRKIDQSEF EGFEYINPLL MSAEECV 



SEQ ID NO:21 OBH2 DNA SEQUENCE 

Nucleic Acid Accession #: L05628 

Coding sequence: 197-4792 (underlined sequences correspond to start and stop codons) 

1          11         21         31         41         51 

|          |          |          |          |          | 

CCAGGCGGCG TTGCGGOCCC GGCCCCGGCT CCCTGCGCCG CCGCCGCCGC CGCCGCCGCC 60 

GCCGCOGCCG CCGCCGCCAG CGCTAGCGCC AGCAGCCGGG CCCGATCACC CGCCGCCCGG 120 

TGCCOGCCGC CGCCCGCGCC AGCAACCGGG CCCGATCACC CGCCGCCCGG TGCCCGCCGC 180 

CGCCCGCGCC ACCGGCATGG CGCTCCGGGG CTTCTGCAGC GCCGATGGCT CCGACCCGCT 240 

CTGGGACTGG AATGTCACGT GGAATACCAG CAACCCCGAC TTCACCAAGT GCTTTCAGAA 300 

CACGGTCCTC GTGTGGGTGC CTTGTTTTTA CCTC7GGGCC TGTTTCCCCT TCTACTTCCT 360 

CT A TCTCTCC CGACATGACC GAGGCTACAT TCAGATGACA CCTCTCAACA AAACCAAAAC 420 

TGCCTTGGGA TTTTTGCTGT GGATCGTCTG CTGGGCAGAC CTCTTCTACT CTTTCTGGGA 480 

AAGAAGTCGG GGCATATTCC TGGCCCCAGT GTTTCTGGTC AGCCCAACTC TCTTGGGCAT 540 

CACCACGCTG CTTGCTACCT TTTTAATTCA GCTGGAGAGG AGGAAGGGAG TTCAGTCTTC 600 

AGGGATCATG CTCACTTTCT GGCTGGTAGC CCTAGTGTGT GCCCTAGCCA TCCTGAGATC 660 

CAAAATTATG ACAGCCTTAA AAGAGGATGC CCAGGTGGAC CTGTTTCGTG ACATCACTTT 720 

CTACGTCTAC TTTTCCCTCT TACTCATTCA GCTCGTCTTG TCCTGTTTCT CAGATCGCTC 780 

ACCCCTGTTC TCGGAAACCA TCCACGACCC TAATCCCTGC CCAGAGTCCA GCGCTTCCTT 840 

CCTGTCGAGG ATCACCTTCT GGTGGATCAC AGGGTTGATT GTCCGGGGCT ACCGCCAGCC 900 

CCTGGAGGGC AGTGACCTCT GGTCCTTAAA CAAGGAGGAC ACGTCGGAAC AAGTCGTGCC 960 

TGTTTTGGTA AAGAACTGGA AGAAGGAATG CGCCAAGACT AGGAAGCAGC CGGTGAAGGT 1020 

TGTGTACTCC TCCAAGGATC CTGCCCAGCC GAAAGAGAG7 TCCAAGGTGG ATGCGAATGA 1080 

GGAGGTGGAG GCTTTGATCG TCAAGTCCCC ACAGAAGGAG TGGAACCCCT CTCTGTTTAA 1140 

GGTGTTATAC AAGACCTTTG GGCCCTACTT CCTCATGAGC TTCTTCTTCA AGGCCATCCA 1200 

CGACCTGATG ATGTTTTCCG GGCCGCAGAT CTTAAAGTTG CTCATCAAGT TCGTGAATGA 1260 

CACGAAGGCC CCAGACTGGC AGGGCTACTT CTACACCGTG CTGCTGTTTG TCACTSOCTC 1320 

CCTGCAGACC CTCGTGCTGC ACCAGTACTT CCACATCTGC TTCGTCAGTG GCATGAGGAT 1380 

CAAGACCGCT GTCATTGGGG CTGTCTATCG GAAGGCCCTG GTGATCACCA ATTCAGCCAG 1440 

AAAATCCTCC ACGGTCGGGG AGATTGTCAA CCTCATGTCT GTGGACGCTC AGAGGTTCAT 1500 

GGACTTGGCC ACGTACATTA ACATGATCTG GTCAGCCCCC CTGCAAGTCA TCCTTGCTCT 1560 

CTACCTCCTG TGGCTGAATC TGGGCCCTTC CGTCCTGGCT GGAGTGGCGG TGATGGTCCT 1620 

CATGGTGCCC GTCAATGCTG TGATGGCGAT GAAGACCAAG ACGTATCAGG TGGCCCACAT 1680 

GAAGAGCAAA GACAATCGGA TCAAGCTGAT GAACGAAATT CTCAATGGGA TCAAAGTGCT 1740 

AAAGCTTTAT GCCTGGGAGC TGGCATTCAA GGACAAGGTG CTGGCCATCA GGCAGGAGGA 1800 

GCTGAAGGTG CTGAAGAAGT CTGCCTACCT GTCAGCCGTG GGCACCTTCA CCTGGGTCTG 1860 

CACGCCCTTT CTGGTGGCCT TGTGCACATT TGCCGTCTAC GTGACCATTG ACGAGAACAA 1920 

CATCCTGGAT GCCCAGACAG CCTTCGTGTC TTTGGCCTTG TTCAACATCC TCCGGTTTCC 1980 

CCTGAACATT CTCCCCATGG TCATCAGCAG CATCGTGCAG GCGAGTGTCT CCCTCAAACG 2040 

CCTGAGGATC TTTCTCTCCC ATGAGGAGCT GGAACCTGAC AGCATCGAGC GACGGCCTGT 2100 

CAAAGACGGC GGGGGCACGA ACAGCATCAC CGTGAGGAAT GCCACATTCA CCTGGGCCAG 2160 

GAGCGACCCT CCCACACTGA ATGGCATCAC CTTCTCCATC CCCGAAGGTG CTTTGGTGGC 2220 
CGTGGTGGGC CAGGTGGGCT GCGGAAAGTC GTCCCTGCTC TCAGCCCTCT TGGCTGAQAT 2280 

GGACAAAGTG GAGGGGCACG TGGCTATCAA GGGCTCCGTG GCCTATGTGC CACAGCAGGC 2340 

CTGGATTCAG AATGATTCTC TCCGAGAAAA CATCCTTTTT GGATGTCAGC TGGAGGAACC 2400 

ATATTACAGG TCCGTGATAC AGGCCTGTGC CCTCCTCCCA GACCTGGAAA TCCTGCCCAG 2460 

TGGGGATCGG ACAGAGATTG GCGAGAAGGG CGTGAACCTG TCTGGGGGCC AGAAGCAGCG 2520 

CGTGAGCCTG GCCCGGGCCG TGTACTCCAA CGCTGACATT TACCTCTTCG ATGATCCOCT 2580 

CTCAGCAGTG GATGCCCATG TGGGAAAACA CATCTTTGAA AATGTGATTG GCCCCAAGGG 2640 

GATGCTGAAG AACAAGACGC GGATCTTGGT CACGCACAGC ATGAGCTACT TGCCGCAGGT 2700 

GGACGTCATC ATCGTCATGA GTGGCGGCAA GATCTCTGAG ATGGGCTCCT ACCAGGAGCT 2760 

GCTGGCTCGA GACGGCGCCT TCGCTGAGTT CCTGCGTACC TATGCCAGCA CAGAGCAGGA 2820 

GCAGGATGCA GAGGAGAACG GGGTCACGGG CGTCAGCGGT CCAGGGAAGG AAGCAAAGCA 2880 

AATGGAGAAT GGCATGCTGG TGACGGACAG TGCAGGGAAG CAACTGCAGA GACAGCTCAG 2940 

CAGCTCCTCC TCCTATAGTG GGGACATCAG CAGGCACCAC AACAGCACCG CAGAACTGCA 3000 

GAAAGCTGAG GCCAAGAAGG AGGAGACCTG GAAGCTGATG GAGGCTGACA AGGCGCAGAC 3060 

AGGGCAGGTC AAGCTTTCCG TGTACTGGGA CTACATGAAG GCCATCGGAC TCTTCATCTC 3120 

CTTCCTCAGC ATCTTCCTTT TCATGTGTAA CCATGTGTCC GCGCTGGCTT CCAACTATTG 3180 

GCTCAGCCTC TGGACTGATG ACCCCATCGT CAACGGGACT CAGGAGCACA CGAAAGTCCG 3240

GCTGAGCGTC TATGGAGCCC TGGGCATTTC ACAAGGGATC GCCGTGTTTG GCTACTCCAT 3300 

GGCCGTGTCC ATCGGGGGGA TCTTGGCTTC CCGCTGTCTG CACGTGGACC TGCTGCACAG 3360 

CATCCTGCGG TCACCCATGA GCTTCTTTGA GCGGACCCCC AGTGGGAACC TGGTGAACCG 3420 

CTTCTCCAAG GAGCTGGACA CAGTGGACTC CATGATCCCG GAGGTCATCA AGATGTTCAT 3480

GGGCTCOCTG TTCAACGTCA TTGGTGCCTG CATCGTTATC CTGCTGGCCA CGCCCATCGC 3540

CGCCATCATC ATCCCGCCCC TTGGCCTCAT CTACTTCTTC GTCCAGAGGT TCTACGTGGC 3600 

TTCCTCCCGG CAGCTGAAGC GCCTCGAGTC GGTCAGCCGC TCCCCGGTCT ATTCCCATTT 3660 

CAACGAGACC TTGCTGGGGG TCAGCGTCAT TCGAGCCTTC GAGGAGCAGG AGCGCTTCAT 3720 

CCACCAGAGT GACCTGAAGG TGGACGAGAA CCAGAAGGCC TATTACCCCA GCATCGTGGC 3780 

CAACAGGTGG CTGGCCGTGC GGCTGGAGTG TGTGGGCAAC TGCATCGTTC TGTTTGCTGC 3840 

CCTGTTTGCG GTGATCTCCA GGCACAGCCT CAGTGCTGGC TTGGTGGGCC TCTCAGTGTC 3900 

TTACTCATTG CAGGTCACCA CGTACTTGAA CTGGCTGGTT CGGATGTCAT CTGAAATGGA 3960

AACCAACATC GTGGCCGTGG AGAGGCTCAA GGAGTATTCA GAGACTGAGA AGGAGGCGCC 4020

CTGGCAAATC CAGGAGACAG CTCCGCCCAG CAGCTGGCCC CAGGTGGGCC GAGTGGAATT 4080 

CCGGAACTAC TGCCTGCGCT ACCGAGAGGA CCTGGACTTC GTTCTCAGGC ACATCAATGT 4140 

CACGATCAAT GGGGGAGAAA AGGTCGGCAT CGTGGGGCGG ACGGGAGCTG GGAAGTCGTC 4200 

CCTGACCCTG GGCTTATTTC GGATCAACGA GTCTGCCGAA GGAGAGATCA TCATCGATGG 4260 

CATCAACATC GCCAAGATCG GCCTGCACGA CCTCCGCTTC AAGATCACCA TCATCCCCCA 4320 

GGACCCTGTT TTGTTTTCGG GTTCCCTCCG AATGAACCTG GACCCATTCA GCCAGTACTC 4380

GGATGAAGAA GTCTGGACGT CCCTGGAGCT GGCCCACCTG AAGGACTTCG TGTCAGCCCT 4440 

TCCTGACAAG CTAGACCATG AATGTGCAGA AGGCGGGGAG AACCTCAGTG TCGGGCAGCG 4500 

CCAGCTTGTG TGCCTAGCCC GGGCCCTGCT GAGGAAGACG AAGATCCTTG TGTTGGATGA 4560 

GGCCACGGCA GCCGTGGACC TGGAAACGGA CGACCTCATC CAGTCCACCA TCCGGACACA 4620 

GTTCGAGGAC TGCACCGTCC TCACCATCGC CCACCGGCTC AACACCATCA TGGACTACAC 4680 

AAGGGTGATC GTCTTGGACA AAGGAGAAAT CCAGGAGTAC GGCGCCCCAT CGGACCTCCT 4740 

GCAGCAGAGA GGTCTTTTCT ACAGCATGGC CAAAGACGCC GGCTTGGTGT GAG OOCCAGA 4800 

GCTGGCATAT CTGGTCAGAA CTGCAGGGCC TATATGCCAG CGCCCAGGGA GGAGTCAGTA 4860 

CCCCTGGTAA ACCAAGCCTC CCACACTGAA ACCAAAACAT AAAAACCAAA CCCAGACAAC 4920

CAAAACATAT TCAAAGCAGC AGCCACCGCC ATCCGGTCCC CTGCCTGGAA CTGGCTGTGA 4980 
AGACCCAGGA GAGACAGAGA TGCGAACCAC C 



SEQ ID NO:22 0BH2 Protein sequence:
Protein Accession #: AAB46616

1 11 21 31 41 51

I I I I I I 

MALRGFCSAD GSDPLWDWNV TWNTSNPDFT KCFQNTVLVW VPCFYLWACF PFYFLYLSRH 60 

DRGYIQMTPL NKTKTALGFL DWIVCWADLF YSPWBRSRGI FLAFVFLVSP TLLGITTLLA 120 

TFLIQLERRK GVQSSGIMLT FWLVALVCAL AILRSKHJTA LKEDAQVDLP RDITFYVYFS 180 

LLLIQLVLSC FSDRSPLFSE TIHDPNPCPE SSASFLSRZT FWWITGLIVR GYRQPLEGSD 240 

LWSLNKEDTS EQWPVLVKN WKKECAKTRK QPVKWYSSK DPAQPKESSK VDANEEVEAL 300 

XVKSPQKEWN PSLFKVLYXT FGPYFLMSFF FKAXHDLMKF SGPQILKLLI KFVKDTKAFD 360 

WQGYFYTVLL FVTACLQTLV LHQYPHICFV SGMRIKTAVI GAVYRKALVI TOSARKSSW 420 

GEIVNLMSVD AQRFMDUATV BWIWSAPLQ VILALYLLWL HLGPSVLAGV AVMVUSVPVN 480 

AVMAMKTKTY QVAHMKSKDN RIKLMNEILN GIKVLKLYAW ELAFKDKVLA IRQEELKVLK 540 

KSAYLSAVGT FTWVCTPFLV ALCTFAVYVT IDENNILDAQ TAFVSLALFN ILRFPLNILP 600 

HvrrssrvcAS VSLKRLRIFL SHEELEFDSI ERRPVKDGGG TNSITVRNAT FTWARSDPPT 660 

LNGITFSIFE GALVAWGQV GCGKSSLLSA LLAEMDKVEQ HVAIKGSVAY VPQQAWIQND 720 

SLRENILFGC QLEEPYYRSV IQACALLPDL EILPSGDRTB IGEKGVNLSG GQKQRVSLAR 780 

AWSNADIYL FDDPLSAVDA HVGKHIFENV IGPKGMLKNK TRILVTHSMS YLPQVDVIIV 840 

MSGGKISEMG SYQELLARDG AFABFLRTYA STEQEQDAEE NGVTGVSGPG KEAKQMEKGM 900 

LVTDSAGKQL QRQLSSSSSY SGDISRHHNS TAELQKAEAK KEETWKLMEA DKAQTGQVKL 960 

SVYWDYMKAI GLFISFLSIF LFMCNHVSAL ASNYWL5LWT DDPIVNGTQE HTKVRLSVYG 1020 

ALGISQGIAV FGYSMAVSIG GILASRCLHV DLLHSILRSP MSFFERTPSG KLVNRFSKEL 1080 

DTVPSMIPEV HtKFMGSLFN VIGACIVILL ATPIAAIIIP PLGLIYFFVQ RFYVASSRQL 1140 

KRLESVSRSP VYSHFNETLL GVSVZ RAPHE QERFIHQSDL KVDENQKAYY PSIVANRWIA 1200 

VRLECVGNCI VLPAALPAVI SRHSLSAGLV GLSVSYSLQV TTXXNWLVRM SSEMETNTVA 1260 

VERLKEYSET EKEAPWQIQE TAPPSSWPQV GRVEFRNYCL RYREDLDFVL RHINVTINGG 1320 

EKVGXVGRTG AGKSSLTLGL FRUJESAEGE IIIDGINIAK IGLHDLRPKI TIIPQDPVLF 1380 

SGSLRMNLDP FSQYSDEEVW TSLELAHLKO FVSALPDKLD HECAEGGENL SVGQRQLVCL 1440 

ARALLRKTKI LVLDEATAAV DLETDDLIQS TIRTQFEDCT VLTIAHRLNT IMDYTRVTVL 1500 
DKGEIQEYGA PSDLLQQRGE FYSMAKDAGL V 
SEQ ID NO:23 PAA2 DNA SEQUENCE

Nucleic Acid Accession #: NM_013309

Coding sequence: 1-1290 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

ATGGCCGGCT CTGGCGCGTG GAAGCGCCTC AAATCTATGC TAAGGAAGGA TGATGCGCCG 60

CTGTTTTTAA ATGACACCAG CGCCTTTGAC TTCTCGGATG AGGCGGGGGA CGAGGGGCTT 120 

TCTCGGTTCA ACAAACTTCG AGTTGTGGTG GCCGATGACG GTTCCGAAGC CCCGGAAAGG 180

CCTGTTAACG GGGCGCACCC GACCCTCCAG GCCGACGATG ATTCCTTACT GGACCAAGAC 240 

TTACCTTTGA CCAACAGTCA GCTGAGTTTG AAGGTOGACT CCTGTGACAA CTGCAGCAAA 300

CAGAGAGAGA TACTGAAGCA GAGAAAGGTG AAAGCCAGGT TGACCATTGC TGCCGTTCTG 360

TACTTGCTTT TCATGATTGG AGAACTTGTA GGTGGATACA TTGCAAATAG CCTAGCAATC 420 

ATGACAGATG CACTTCATAT GTTAACTGAC CTAAGCGCCA TCATACTCAC CCTGCTTGCT 480 

TTGTGGCTAT CATCAAAATC ACCAACCAAA AGATTCACCT TTGGATTTCA TCGCTTAGAG 540 

GTTTTGTCAG CTATGATTAG TGTGCTGTTG GTGTATATAC TTATGGGATT CCTCTTATAT 600 

GAAGCTGTGC AAAGAACTAT CCATATGAAC TATGAAATAA ATGGAGATAT AATGCTCATC 660

ACCGCAGCTG TTGGAGTTGC AGTTAATGTA ATAATGGGGT TTCTGTTGAA CCAGTCTGGT 720

CACCGTCACT CCCATTCCCA CTCCCTGCCT TCAAATTCCC CTACCAGAGG TTCTGGGTCT 780 

GAACGTAACC ATGGGCAGGA TAGCCTGGCA GTGAGAGCTG CATTTGTACA TGCTTTGGGA 840

GATTTGGTAC AGAGTGTTGG TGTGCTAATA GCTGCATACA TCATACGATT CAAGCCAGAA 900 

TACAAGATTG CTGATCCCAT CTGTACATAC GTATTTTCAT TACTTGTGGC TTTTACAACA 960 

TTTCGAATCA TATGGGATAC ACT AGITATA ATACTAGAAG GTGTGCCAAG CCATTTGAAT 1020 

GTAGACTATA TCAAAGAAGC CTTGATGAAA ATAGAAGATG TATATTCAGT CGAAGATTTA 1080 

AATATCTGGT CTCTCACTTC AGGAAAATCT ACTGCCATAG TTCACATACA GCTAATTCCT 1140 

GGAAGTTCAT CTAAATGGGA GGAAGTACAG TCCAAAGCAA ACCATTTATT ATTGAACACA 1200 

TTTGGCATGT ATAGATGTAC TATTCAGCTT CAGAGTTACA GGCAAGAAGT GGACAGAACT 1260
TGTGCAAATT GTCAGAGTTC TAGTCCCTGA 



SEQ ID NO:24 PAA2 Protein sequence
Protein Accession #: NP_037441 

1 11 21 31 41 51 

I I I I I I

MAGSGAWKRL KSMLRKDDAP LFLNDTSAFD FSDEAGDEGL SRFNKLRVVV ADDGSEAPER 60
PVNGAHPTLQ ADDDSLLDQD LPLTNSQLSL KVDSCDNCSK QREILKQRKV KARLTIAAVL 120
YLLFMIGELV GGYIANSLAI MTDALHMLTD LSAIILTLLA LWLSSKSPTK RFTFGFHRLE 180
VLSAMISVLL VYILMGFLLY EAVQRTIHMN YEINGDIMLI TAAVGVAVNV IMGFLLNQSG 240
HRHSHSHSLP SNSPTRGSGC ERNHGQDSLA VRAAFVHALG DLVQSVGVLI AAYIIRFKPE 300
YKIADPICTY VFSLLVAFTT FRIIWDTVVI ILEGVPSHLN VDYIKEALMK IEDVYSVEDL 360
NIWSLTSGKS TAIVHIQLIP GSSSKWEEVQ SKANHLLLNT FGMYRCTIQL QSYRQEVDRT 420
CANCQSSSP 



SEQ ID NO:25 PAA3 DNA SEQUENCE

Nucleic Acid Accession #: AB037785

Coding sequence: 375-2788 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

I I I I I I

GCCGAGTCGG TGGCGGCTGC AGGCTGGGAG GGAGAAGTGC TACGOCTTTG CAGGTTGGCG 60 

AAGTGGTTCC AGGCTACCCG GCTAGTCTGG CACGGOCCOG TCTTCTGCOT CCTCCTCCGT 120 

CGCGTGGCGG OGGGAACTGT TGGCCGCGCG GCCTCGGGAA CGGCCCAGGT OCCCGCCCGC 180 

AGGTCCCGGG CAGATAACAT AGATCATCAG TAGAAAACTT CTTGAAGTTG TTCAAGAAAA 240 

ATTTGAAAGT AGCAAAATAG AAAATAAAGA ATTAACAGCA GATACAGAGG ACAGCATGGA 300 

AGTGTTGTCT TAGGAAACAG AACACAGCAG TGAAAAAACA GACAAAATCC GCTCAGATAC 360 

AACTGCAGCT GATAATGTTT TCCGGCTTCA ATGTCTTTAG AGTTGGGATC TCTTTTGTCA 420 

TAATGTGCAT TTTTTACATG CCAACAGTAA ACTCTTTACC AGAACTGAGT CCTCAGAAAT 480 

ATTTTAGTAC ATTGCAACCA GGTCTTGAAG AACTGAATGA GGCTGTTAGA CCTCTGCAGG 540 

ACTATGGAAT TTCAGTTGCC AAGGTTAATT GTGTCAAAGA AGAAATATCA AGATACTGTG 600 

GAAAAGAAAA GGATTTGATG AAAGCATATT TATTCAAGGG CAACATATTG CTCAGAGAAT 660 

TCCCTACTGA CACCTTGTTT GATGTGAATG OCATTGTCGC CCATGTTCTC TTTGCTCTTC 720 

TTTTTAGTGA AGTGAAATAT ATTACCAACC TGGAAGACCT TCAGAACATA GAAAATGCTC 780 

TGAAAGGAAA AGCAAATATT ATATTCTCAT ATGTAAGAGC CATTGGAATA CCAGAGCACA 840 

GAGCAGTCAT GGAAGCCGGT TTTGTGTATG GGACTACATA CCAATTTGTC TTAAOCACAG 900 

AAATTGCCCT TTTGGAAAGT ATTGGCTCTG AGGATGTGGA ATATGCACAT CTCTACTTTT 960 

TTCATTGTAA ACTAGTCTTG GACTTGACCC AGCAATGTAG AAGAACACTA ATGGAACAGC 1020 

CATTGACTAC ACTGAACATT CACCTGTTTA TTAAGACAAT GAAAGCACCT CTGTTGACTG 1080 

AAGTTGCTGA AGATCCTCAA CAAGTTTCAA CTGTCCATCT CCAACTGGGC TTACCACTGG 1140 

TTTTTATTGT TAGCCAACAG GCTACTTATG AAGCTGATAG AAGAACTGCA GAATGGGTTG 1200 

CTTGGCGTCT TCTGGGAAAA GCAGGAGTTC TACTCTTGTT AAGGGACTCT TTGGAAGTGA 1260 

ACATTCCTCA AGATGCTAAT GTGGTCTTCA AAAGAGCAGA AGAGGGAGTT CCAGTGGAAT 1320 

TTTTGGTATT ACATGATGTT GATTTAATAA TATCTCATGT GGAAAATAAT ATGCACATTG 1380 

AGGAAATACA AGAAGATGAA GACAATGACA TGGAAGGTCC AGATATAGAT GTTCAGGATG 1440 

ATGAAGTGGC AGAAACTGTT TTCAGAGATA GGAAGAGAAA ATTACCTTTG GAACTTACAG 1500 
TGGAACTAAC AGAAGAAACA TTTAATGCAA CAGTGATGCC TTCTGACAGC ATAGTACTCT 1560

TCTATGCTGG TTGGCAAGCA GTATCCATGG CATTTTTGCA ATCCTATATT GATGTGGCAG 1620 

TTAAACTGAA AGGCACATCT ACTATGCTTC TTACTAGAAT AAACTGTGCA GATTGGTCTG 1680 

ATGTATGTAC TAAGCAAAAT GTTACTGAAT TTCCTATCAT AAAGATGTAC AAGAAAGGCG 1740 

AGAACCCAGT ATCTTATGCT GGAATGTTAG GAACCAAAGA TCTCCTAAAA TTTATCCAGC 1800 

TCAACAGGAT TTCATATCCA GTGAATATAA CATCGATCCA AGAAGCAGAA GAATATTTAA 1860 

GTGGGGAATT ATATAAAGAC CTCATCTTGT ATTCTAGTGT GTCAGTATTG GGACTATTTA 1920 

GTCCAACCAT GAAAACAGCA AAAGAAGATT TTAGTGAAGC AGGAAACTAC CTAAAAGGAT 1980 

ATGTTATCAC TGGAATTTAT TCTGAAGAAG ATGTTTTGCT ACTGTCAACC AAATATGCTG 2040 

CAAGTCTTCC AGCCCTGCTG CTTGCCAGAC ACACAGAAGG CAAAATAGAG AGCATCCCAC 2100 

TAGCTAGCAC ACATGCACAA GACATAGTTC AAATAATAAC AGATGCACTA CTGGAAATGT 2160 

TTCCGGAAAT CACTGTGGAA AATCTTCCCA GTTATTTCAG ACTTCAGAAA CCATTATTGA 2220 

TTTTGTTCAG TGATGGCACT GTAAATCCTC AATATAAAAA AGCAATATTG ACACTGGTAA 2280 

AGCAGAAATA CTTGGATTCA TTTACTCCAT GCTGGTTAAA TCTAAAGAAT ACTCCAGTGG 2340 

GGAGAGGAAT CTTGCGGGCA TATTTTGATC CTCTGCCTCC CCTTOCTCTT CTTQTTTTC Q 2400 

TGAATCTGCA TTCAGGTGGC CAAGTATTTG CATTTCCTTC AGACCAGGCT ATAATTGAAG 2460 

AAAACCTTGT ATTGTGGCTG AAGAAATTAG AAGCAGGACT AGAAAATCAT ATCACAATTT 2520 

TACCTGCTCA AGAATGGAAA OCTCCTCTTC CAGCTTATGA TTTTCTAAGT ATGATAGATG 2580 

CCGCAACATC TCAACGTGGC ACTAGGAAAG TTCCCAAGTG TATGAAAGAA ACAGATGTGC 2640 

AGGAGAATGA TAAGGAACAA CATGAAGATA AATCGGCAGT CAGAAAAGAA CCGATTGAAA 2700 

CTCTGAGAAT AAAGCATTGG AATAGAAGTA ATTGGTTTAA AGAAGCAGAA AAATCATTTA 2760 

GACGTGATAA AGAGTTAGGA TGCTCAAAAG TGAACTAATT TTATAGGGCT GTGGTTTCCA 2820 

AAATTTTTTT GGCATGATAG ACTTAATTTA TTTCCTTAAA GAATAATATT AAATCATTTC 2880 

AAGTTTGCAG ACTAGTGCCA TCCAATAGAA TTATAATATA AGTCACATAT TTTATTTAAA 2940 

ATTTTCTAGT AACTACATTA AACAAAGTAA AAGTGAGCAG GGCAAAATAA TTTTGATATT 3000 

ACTTTTCACC CAGTAGTATA CCCAAAATAG CGAAATATAG AAATTATTAA TGAGATATTT 3060 

TACATCCTTT TTTGTACCAA GTCTTCTAAA TGCAGTACAT ATTTTATACT TACTGCATTT 3120 

CTTACTTCCG AGTAGCCATA TTTCAAGTGT TCATTGCCAC ATGTGGCCTG TGACTACTGT 3180 

ATTGGACAGT TCAGTACTAG ACAAAAACTA GCATAATTAA CTTAGTTCTA GCCATGATTT 3240 

CTATTTGGAT TAAAATTAAA CTCTAATCAC AGTTAACTCC ACAGTGCATT CATGCAGCTG 3300 

ACAGTTATAT TTGTTTTATT GGAGTCATGA TATTAAAATC AGC5GTTTGTC AACCTCAGGG 3360 

GATATTTAGC AATTGTCGGG AGACATTTTT GATGTCATGA CTAGGGCAGT TATTGACATT 3420 

TAGTGAGTAG AGGCCATGGA TCCTGCTAAA TAACCTGCAT TGGACAGCGC CCCACAACAA 3480 

AGAATTATCC TGCCCGAAAT GGTAGTCGTG CCAAGGCTGA GTAACCTTGT GTTAAAAGTA 3540 

ACCTGTGGCA GACTAGGTTT CCAGAATTTC CTGGTTCTGC TCACGTATCA TGTTTGAAAA 3600 

AATTTTGGCT ATTAAAGATA TGTATTAGAT GGTCTTATCC TGA1TATTAC CTGGATACAA 3660 

CTTGATCTTT TCTAATATTT TCAGAAAGTG ATGGGATAAC CCTAGAAGAG GACTCAGAAT 3720 

GATATTTATA TTTTAAGTGA GTCTTAAAAC CTCCTCTTAT TTCTACAAGT TATATGGCTA 3780 

AATTTCAGAT TGAACAGGGA TTCAGCATTC TGCCATCTCC TCATGGAAAG AGAGGCTCCC 3840 

TCATCTGAAG CGTCTCTGAA ATCTACCCTT GCAAGCTTCA GACAAATCAG TTGATCTCCC 3900 

TGAGCCACAC GGCCTCATTC TGTGAGGGAG GGAAAGATTA GCCAAAGAGT TAATTTTCAT 3960 

TCCAAATCAC TTAGCTGTTA GACTGATCTG TTTGTAGCAG TTGTTTGTCT CATTTTTGCT 4020 

CTGTGCATTT TTTGAGACAT TTGTTGAGAA TATTCTATTT GGTGCTCTAC TGTATTTTTC 4080 

TTTTTAATAT CTACTTGATA TCTTGTTCTT TAAATTTTCT TCACATATGG TTTGCCTGAT 4140 

ACAACTGATT TTTATAACTG AAATTTAAGG AATCTAACAG CTAAAACTCA GTAAGTGCAT 4200 

MTATTTCCTT ATAACATAGA CCCGTTGCTA CTCTCAGCAC CCTCTOCTCA ATTTTTTTTC 4260 

CTGTAGCATG TGATGCCTGA TPAAACTCAT TTTCATTTGC TTTTATTTCT AATATGGGAA 4320 

CAATGAGAGT GAACTCTAAA TATAGGTTGT AGTAAIAAAA CATCATTAGC CTAATTATTA 4380 

GAAAATGCTA ATTAAGTACC AGCACATAGA AACATGAAAT TGCTTAGTCA TTGTACCTTT 4440 

GTCAGCAATT TTGACAGTCA TTAATGTTTG TCATAATTTT AAATAAAGTG TCTGGGTTTC 4500 
AGAATACCTT CAAAAAAAAA AAAAAA 

SEQ ID NO:26 PAA3 Protein sequence
Protein Accession #: BAA92582

1 11 21 31 41 51 

I I I I I I

MFSGFNVFRV GISFVIMCIF YMPTVNSLPE LSPQKYFSTL QPGLEELNEA VRPLQDYGIS 60

VAKVNCVKEE ISRYCGKEKD LMKAYLFKGN ILLREFPTDT LFDVNAIVAH VLFALLFSEV 120 

KYITNLEDLQ NIENALKGKA NIIFSYVRAI GIPEHRAVME AGFVYGTTYQ FVLTTEIALL 180

ESIGSEDVEY AHLYFFHCKL VLDLTQQCRR TLMEQPLTTL NIHLFIKTMK APLLTEVAED 240

PQQVSTVHLQ LGLPLVFIVS QQATYEADRR TAEWVAWRLL GKAGVLLLLR DSLEVNIPQD 300

ANVVFKRAEE GVPVEFLVLH DVDLIISHVE NNMHIEEIQE DEDNDMEGPD IDVQDDEVAE 360

TVFRDEKRKL PLELTVELTE ETFNATVMAS DSIVLFYAGW QAVSMAFLQS YIDVAVKLKG 420 

TSTMLLTRIN CADWSDVCTK QNVTEFPIIK MYKKGENPVS YAGMLGTKDL LKFIQLNRIS 480

YPVNITSIQE AEEYLSGELY KDLILYSSVS VLGLFSPTMK TAKEDFSEAG NYLKGYVITG 540 

IYSEEDVLLL STKYAASLPA LLLARHTEGK IESIPLASTH AQDIVQIITD ALLEMFPEIT 600

VENLPSYFRL QKPLLILFSD GTVNPQYKKA ILTLVKQKYL DSFTPCWLNL KNTPVGRGIL 660 

RAYFDPLPPL PLLVLVNLHS GGQVFAFPSD QAIIEENLVL WLKKLEAGLE NHITILPAQE 720 

WKPPLPAYDF LSMIDAATSQ RGTRKVPKCM KETDVQENDK EQHEDKSAVR KEPIETLRIK 780
HWNRSNWFKE AEKSFRRDKE LGCSKVN

SEQ ID NO:27 PAA5 DNA SEQUENCE

Nucleic Acid Accession #: NM_012449

Coding sequence: 65-1085 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

CCGAGACTCA CGGTCAAGCT AAGGCGAAGA GTGGGTGGCT GAAGCCATAC TATTTTATAG 60 

AATTAATGGA AAGCAGAAAA GACATCACAA ACCAAGAAGA ACTTTGGAAA ATGAAGCCTA 120 
GGAGAAATTT AGAAGAAGAC GATTATTTGC ATAAGGACAC GGGAGAGACC AGCATGCTAA 180

AAAGACCTGT GCTTTTGCAT TTGCACCAAA CAGCCCATGC TGATGAATTT GACTGCCCTT 240 

CAGAACTTCA GCACACACAG GAACTCTTTC CACAGTGGCA CTTGCCAATT AAAATAGCTG 300 

CTATTATAGC ATCTCTGACT TTTCTTTACA CTCTTCTGAG GGAAGTAATT CACCCTTTAG 360 

CAACTTCCCA TCAACAATAT TTTTATAAAA TTCCAATCCT GGTCATCAAC AAAGTCTTGC 420

CAATGGTTTC CATCACTCTC TTGGCATTGG TTTACCTGCC AGGTGTGATA GCAGCAATTG 480 

TCCAACTTCA TAATGGAACC AAGTATAAGA AGTTTCCACA TTGGTTGGAT AAGTGGATGT 540 

TAACAAGAAA GCAGTTTGGG CTTCTCAGTT TCTTTTTTGC TGTACTGCAT GCAATTTATA 600 

GTCTGTCTTA CCCAATGAGG CGATCCTACA GATACAAGTT GCTAAACTGG GCATATCAAC 660 

AGGTCCAACA AAATAAAGAA GATGCCTGGA TTGAGCATGA TGTTTGGAGA ATGGAGATTT 720 

ATGTGTCTCT GGGAATTGTG GGATTGGCAA TACTGGCTCT GTTGGCTGTG ACATCTATTC 780 

CATCTGTGAG TGACTCTTTG ACATGGAGAG AATTTCACTA TATTCAGAGC AAGCTAGGAA 840

TTGTTTCCCT TCTACTGGGC ACAATACACG CATTGATTTT TGCCTGGAAT AAGTGGATAG 900 

ATATAAAACA ATTTGTATGG TATACACCTC CAACTTTTAT GATAGCTGTT TTCCTTCCAA 960

TTGTTGTCCT GATATTTAAA AGCATACTAT TCCTGCCATG CTTGAGGAAG AAGATACTGA 1020 

AGATTAGACA TGGTTGGGAA GACGTCACCA AAATTAACAA AACTGAGATA TGTTCCCAGT 1080 

TGTAGAATTA CTGTTTACAC ACATTTTTGT TCAATATTGA TATATTTTAT CACCAACATT 1140 
TCAAGTTTGT ATTTGTTAAT AAAATGATTA TTCAAGGAAA AAAAAAAAAA AAAAA 

SEQ ID NO:28 PAA5 Protein sequence
Protein Accession #: NP_036581

1 11 21 31 41 51 

I I I I I I 

MESRKDITNQ EELWKMKPRR NLEEDDYLHK DTGETSMLKR PVLLHLHQTA HADEFDCPSE 60

LQHTQELFPQ WHLPIKIAAI IASLTFLYTL LREVIHPLAT SHQQYFYKIP ILVINKVLPM 120

VSITLLALVY LPGVIAAIVQ LHNGTKYKKF PHWLDKWMLT RKQFGLLSFF FAVLHAIYSL 180

SYPMRRSYRY KLLNWAYQQV QQNKEDAWIE HDVWRMEIYV SLGIVGLAIL ALLAVTSIPS 240

VSDSLTWREF HYIQSKLGIV SLLLGTIHAL IFAWNKWIDI KQFVWYTPPT FMIAVFLPIV 300
VLIFKSILFL PCLRKKILKI RHGWEDVTKI NKTEICSQL

SEQ ID NO:29 PAA7 DNA SEQUENCE

Nucleic Acid Accession #: NM_030774

Coding sequence: 1-883 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

ATGAGTTCCT GCAACTTCAC ACATGCCACC TTTGTGCTTA TTGGTATCCC AGGATTAGAG 60 

AAAGCCCATT TCTGGGTTGG CTTCCCCCTC CTTTCCATGT ATGTAGTGGC AATGTTTGGA 120 

AACTGCATCG TGGTCTTCAT CGTAAGGACG GAACGCAGCC TGCACGCTCC GATGTACCTC 180 

TTTCTCTGCA TGCTTGCAGC CATTGACCTG GCCTTATCCA CATCCACCAT GCCTAAGATC 240 

CTTGCCCTTT TCTGGTTTGA TTCCCGAGAG ATTAGCTTTG AGGCCTGTCT TACCCAGATG 300 

TTCTTTATTC ATGCCCTCTC AGCCATTGAA TCCACCATCC TGCTGGCCAT GGCCTTTGAC 360 

CGTTATGTGG CCATCTGCCA CCCACTGOGC CATGCTGCAG TGCTCAACAA TACAGTAACA 420 

GCCCAGATTG GCATCGTGGC TGTGGTCCGC GGATCCCTCT TTTTTTTCCC ACTGCCTCTG 480 

CTGATCAAGC GGCTGGCCTT CTGCCACTCC AATGTCCTCT CGCACTCCTA TTGTGTCCAC 540 

CAGGATGTAA TGAAGTTGGC CTATGCAGAC ACTTTGCCCA ATGTGGTATA TGGTCTTACT 600 

GCCATTCTGC TGGTCATGGG CGTGGACGTA ATGTTCATCT CCTTGTCCTA TTTTCTGATA 660 

ATACGAACGG TTCTGCAACT GCCTTCCAAG TCAGAGCGGG CCAAGGCCTT TGGAACCTGT 720

GTGTCACACA TTGGTGTGGT ACTCGCCTTC TATGTGCCAC TTATTGGCCT CTCAGTGGTA 780 

CACCGCTTTG GAAACAGCCT TCATCCCATT GTGCGTGTTG TCATGGGTGA CATCTACCTG 840 

CTGCTGCCTC CTGTCATCAA TCCCATCATC TATGGTGCCA AAACCAAACA GATCAGAACA 900 

CGGGTGCTGG CTATGTTCAA GATCAGCTGT GACAAGGACT TGCAGGCTGT GGGAGGCAAG 960 

TGACCCTTAA CACTACACTT CTCCTTATCT TTATTGGCTT GATAAACATA ATTATTTCTA 1020 

ACACTAGCTT ATTTCCAGTT GCCCATAAGC ACATCAGTAC TTTTCTCTGG CTGGAATAGT 1080 

AAACTAAAGT ATGGTACATC TACCTAAAGG ACTATTATGT GGAATAATAC ATACTAATGA 1140 

AGTATTACAT GATTTAAAGA CTACAATAAA ACCAAACATG CTTATAACAT TAAGAAAAAC 1200 

AATAAAGATA CATGATTGAA ACCAAGTTGA AAAATAGCAT ATGCCTTGGA GGAAATGTGC 1260 

TCAAATTACT AATGATTTAG TGTTGTCCCT ACTTTCTCTC TCTTTTTTCT TTCTTTTTTT 1320

TTTATTATGG TTAGCTGTCA CATACAACTT I ' mTlTO'l' TGAGATGGGG TCTCGCTCTG 1380 

TCACCAGGCT GGAGTGCAGT GGCGCGATCT CGGCTCACTG CAACCTCCAC ATCCCATGTT 1440 

GAAGTAATTC TTCTGCCTCA GCCTCCCGAG TAGCTGGGAC TAGAGGAACG TGCCACCATG 1500 

ACTGGCTAAT TTTCTGTATT TTTTAGTAGA GACAGAGTTT CACCATGTTG GCCAGGATGG 1560 

TCTCGATCTC CTGACCTTGT GATCCACCCG CCTCAGCCTC CCAAAGTGTT GGGATTACAG 1620 

GTGTGAACCA CTGTGCCCGG CCTGTGTACA ACTTTTTAAA TAGGGAATAT GATAGCTTCG 1680 

CATGGTGGTG TGCACCTATA GCCCCCACTG CCTGGAAAGC TGAGGTGGGA GAATCGCTTG 1740 

AGTCCAGGAG TTTGAGGTTA CAGTGATCCA CGATCGTACC ACTACACTCC AGCCTGGGCA 1800 

ACAGAGCAAG ACCCTGTCTC AAAGCATAAA ATGGAATAAC ATATCAAATG AAACAGGGAA 1860

AATGAAGCTG ACAATTTATG GAAGCCAGGG CTTGTCACAG TCTCTACTGT TATTATGCAT 1920 

TACCTGGGAA TTTATATAAG CCCTTAATAA TARTGCCAAT GAACATCTCA TGTGTGCTCA 1980 

CAATGTTCTG GCACTATTAT AAGTGCTTCA CAGGTTTTAT GTGTTCTTCG TAACTTTATG 2040 

GAGTAGGTAC CATTTGTGTC TCTTTATTAT AAGTGAGAGA AATGAAGTTT ATATTATCAA 2100 

GGGGACTAAA GTCACACGGC TTGTGGGCAC TGTGCCAAGA TTTAAAATTA AATTTGATGG 2160 

TTGAATACAG TTACTTAATG ACCATGTTAT ATTGCTTCCT GTGTAACATC TGCCATTTAT 2220 

TTCCTCAGCT GTACAAATCC TCTGTTTXCT CTCTGTTACA CACTAACATC AATGGCTTTG 2280 

TACTTGTGAT GAGAGATAAC CTTGCCCTAG TTGTGGGCAA CACATGCAGA ATAATCCTGT 2340 

TTTACAGCTG CCTTTCGTGA TCTTATTGCT TGCTTTTCTC CAGATTCAGG GAGAATGTTG 2400

TTGTCTATTT GTCTCTTACA TCTCCTTGAT CATGTCTTCA TTTTTTAATG TGCTCTGTAC 2460 

CTGTCAAAAA TTTTGAATCT ACACCACATG CTATTGTCTG AACTTGAGTA TAAGATAAAA 2520
TAAAATTTTA TTTTAAATTT T 
SEQ ID NO:30 PAA7 PROTEIN SEQUENCE

Protein Accession #: NP_110401

1 11 21 31 41 51

I I I I I I

MSSCNFTHAT FVLIGIPGLE KAHFWVGFPL LSMYVVAMFG NCIVVFIVRT ERSLHAPMYL 60

FLCMLAAIDL ALSTSTMPKI LALFWFDSRE ISFEACLTQM FFIHALSAIE STILLAMAFD 120

RYVAICHPLR HAAVLNNTVT AQIGIVAVVR GSLFFFPLPL LIKRLAFCHS NVLSHSYCVH 180

QDVMKLAYAD TLPNVVYGLT AILLVMGVDV MFISLSYFLI IRTVLQLPSK SERAKAFGTC 240

VSHIGVVLAF YVPLIGLSVV HRFGNSLHPI VRVVMGDIYL LLPPVINPII YGAKTKQIRT 300
RVLAMFKISC DKDLQAVGGK
SEQ ID NO:31 PAV6 DNA SEQUENCE

Nucleic Acid Accession #: XM_050837

Coding sequence: 1-1020 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

ATGAACTGGG AGCTGCTGCT GTGGCTGCTG GTGCTGTGCG CGCTGCTCCT GCTCTTGGTG 60 

CAGCTGCTGC GCTTCCTGAG GGCTGACGGC GACCTGACGC TACTATGGGC CGAGTGGCAG 120 

GGACGACGCC CAGAATGGGA GCTGACTGAT ATGGTGGTGT GGGTGACTGG AGCCTCGAGT 180 

GGAATTGGTG AGGAGCTGGC TTACCAGTTG TCTAAACTAG GAGTTTCTCT TGTGCTGTCA 240 

GCCAGAAGAG TGCATGAGCT GGAAAGGGTG AAAAGAAGAT GCCTAGAGAA TGGCAATTTA 300 

AAAGAAAAAG ATATACTTGT TTTCCCCCTT GACCTGACCG ACACTGGTTC CCATGAAGCG 360

GCTACCAAAG CTGTTCTCCA GGAGTTTGGT AGAATCGACA TTCTGGTCAA CAATGGTGGA 420 

ATGTCOCAGC GTTCTCTGTG CATGGATACC AGCTTGGATG TCTACAGAAA GCTAATAGAG 480 

CTTAACTACT TAGGGACGGT GTCCTTGACA AAATGTGTTC TGCCTCACAT GATCGAGAGG 540 

AAGCAAGGAA AGATTGTTAC TGTGAATAGC ATCCTGGGTA TCATATCTGT ACCTCTPTCC 600 

ATTGGATACT GTGCTAGCAA GCATGCTCTC CGGGGTTTTT TTAATGGCCT TCGAACAGAA 660 

CTTGGCACAT ACCCAGGTAT AATAGTTTCT AACATTTGCC CAGGACCTGT GCAATCAAAT 720 

ATTGTGGAGA ATTCCCTAGC TGGAGAAGTC ACAAAGACTA TAGGCAATAA TGGAGACCAG 780 

TCCCACAAGA TGACAACCAG TCGTTGTGTG CGGCTGATGT TAATCAGCAT GGCCAATGAT 840 

TTGAAAGAAG TTTGGATCTC AGAACAACCT TTCTTGTTAG TAACATATTT GTGGCAATAC 900 

ATGCCAACCT GGGCCTGGTG GATAACCAAC AAGATGGGGA AGAAAAGGAT TGAGAACTTT 960 
AAGAGTGGTG TGGATGCAGA CTCTTCTTAT TTTAAAATCT TTAAGACAAA ACATGACTGA

SEQ ID NO:32 PAV6 Protein sequence
Protein Accession #: XP_050837 

1 11 21 31 41 51 

I I I I I I 

MNWELLLWLL VLCALLLLLV QLLRFLRADG DLTLLWAEWQ GRRPEWELTD MVVWVTGASS 60

GIGEELAYQL SKLGVSLVLS ARRVHELERV KRRCLENGNL KEKDILVLPL DLTDTGSHEA 120

ATKAVLQEFG RIDILVNNGG MSQRSLCMDT SLDVYRKLIE LNYLGTVSLT KCVLPHMIER 180

KQGKIVTVNS ILGIISVPLS IGYCASKHAL RGFFNGLRTE LATYPGIIVS NICPGPVQSN 240

IVENSLAGEV TKTIGNNGDQ SHKMTTSRCV RLMLISMAND LKEVWISEQP FLLVTYLWQY 300
MPTWAWWITN KMGKKRIENF KSGVDADSSY FKIFKTKHD

SEQ ID NO:33 PBA6 DNA SEQUENCE

Nucleic Acid Accession #: NM_006853

Coding sequence: 28-674 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

AGGAATCTGC GCTCGGGTTC CGCAGATGCA GAGGTTGAGG TGGCTGCGGG ACTGGAAGTC 60 

ATCGGGCAGA GGTCTCACAG CAGCCAAGGA ACCTGGGGCC CGCTCCTCCC CCCTCCAGGC 120 

CATGAGGATT CTGCAGTTAA TCCTGCTTGC TCTGGCAACA GGGCTTGTAG GGGGAGAGAC 180 

CAGGATCATC AAGGGGTTCG AGTGCAAGCC TCACTCCCAG CCCTGGCAGG CAGCCCTGTT 240 

CGAGAAGACG CGGCTACTCT GTGGGGCGAC GCTCATCGCC CCCAGATGGC TCCTGACAGC 300

AGCCCACTGC CTCAAGCCCC GCTACATAGT TCACCTGGGG CAGCACAACC TCCAGAAGGA 360 

GGAGGGCTGT GAGCAGACCC GGACAGCCAC TGAGTCCTTC CCCCACCCCG GCTTCAACAA 420 

CAGCCTCCCC AACAAAGACC ACCGCAATGA CATCATGCTG GTGAAGATGG CATCGCCAGT 480 

CTCCATCACC TGGGCTGTGC GACCCCTCAC CCTCTCCTCA CGCTGTGTCA CTGCTGGCAC 540 

CAGCTGCCTC ATTTCCGGCT GGGGCAGCAC GTCCAGCCCC CAGTTACGCC TGCCTCACAC 600 

CTTGCGATGC GCCAACATCA CCATCATTGA GCACCAGAAG TGTGAGAACG CCTACCCCGG 660 

CAACATCACA GACACCATGG TGTGTGCCAG CGTGCAGGAA GGGGGCAAGG ACTCCTGCCA 720 

GGGTGACTCC GGGGGCCCTC TGGTCTGTAA CCAGTCTCTT CAAGGCATTA TCTCCTGGGG 780 

CCAGGATCCG TGTGCGATCA CCCGAAAGCC TGGTGTCTAC ACGAAAGTCT GCAAATATGT 840 

GGACTGGATC CAGGAGACGA TGAAGAACAA TTAGACTGGA CCCACCCACC ACAGCCCATC 900 

ACCCTCCATT TCCACTTGGT GTTTGGTTCC TGTTCACTCT GTTAATAAGA AACCCTAAGC 960 

CAAGACCCTC TACGAACATT CTTTGGGCCT CCTGGACTAC AGGAGATGCT GTCACTTAAT 1020 

AATCAACCTG GGGTTCGAAA TCAGTGAGAC CTGGATTCAA ATTCTGCCTT GAAATATTGT 1080 

GACTCTGGGA ATGACAACAC CTGGTTTGTT CTCTGTTGTA TCCCCAGCCC CAAAGACAGC 1140 
TCCTGGCCAT ATATCAAGGT TTCAATAAAT ATTTGCTAAA TGAGTG 

SEQ ID NO:34 PBA6 PROTEIN SEQUENCE

1 11 21 31 41 51

I I I I I I 

MRILQLILLA LATGLVGGET RIIKGFECKP HSQPWQAALF EKTRLLCGAT LIAPRWLLTA 60

AHCLKPRYIV HLGQHNLQKE EGCEQTRTAT ESFPHPGFNN SLPNKDHRND IMLVKMASPV 120

SITWAVRPLT LSSRCVTAGT SCLISGWGST SSPQLRLPHT LRCANITIIE HQKCENAYPG 180 

NITDTMVCAS VQEGGKDSCQ GDSGGPLVCN QSLQGIISWG QDPCAITRKP GVYTKVCKYV 240 
DWIQETMKNN 

SEQ ID NO:35 PBC1 DNA SEQUENCE

Nucleic Acid Accession #: NM_001775

Coding sequence: 70-972 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

CTAAAGCTCT CTTGCTGCCT AGCCTOCTGC CGGCCTCATC TTCGCCCAGC CAAOOCOGOC 60 

TGGAGCCCTA TGGCCAACTG CGAGTTCAGC CCGGTGTCCG GGGACAAACC CTGCTGCCGG 120

CTCTCTAGGA GAGCCCAACT CTGTCTTGGC GTCAGTATCC TGGTCCTGAT CCTCGTCGTG 180 

GTGCTCGCGG TGGTCGTCCC GAGGTGGCGC CAGACGTGGA GCGGTCCGGG CACCACCAAG 240 

CGCTTTCCCG AGACCGTCCT GGCGCGATGC GTCAAGTACA CTGAAATTCA TCCTGAGATG 300

AGACATGTAG ACTGCCAAAG TGTATGGGAT GCTTTCAAGG GTGCATTTAT TTCAAAACAT 360 

CCTTGCAACA TTACTGAAGA AGACTATCAG CCACTAATGA AGTTGGGAAC TCAGACCGTA 420 

CCTTGCAACA AGATTCTTCT TTGGAGCAGA ATAAAAGATC TGGCCCATCA GTTCACACAG 480 

GTCCAGCGGG ACATGTTCAC CCTGGAGGAC ACGCTGCTAG GCTACCTTGC TGATGACCTC 540 

ACATGGTGTG GTGAATTCAA CACTTCCAAA ATAAACTATC AATCTTGCCC AGACTGGAGA 600 

AAGGACTGCA GCAACAACCC TGTTTCAGTA TTCTGGAAAA CGGTTTCCCG CAGGTTTGCA 660 

GAAGCTGCCT GTGATGTGGT CCATGTGATG CTCAATGGAT CCCGCAGTAA AATCTTTGAC 720 

AAAAACAGCA CTTTTGGGAG TGTGGAAGTC CATAATTTGC AACCAGAGAA GGTTCAGACA 780 

CTAGAGGCCT GGGTGATACA TGGTGGAAGA GAAGATTCCA GAGACTTATG CCAGGATCCC 840 

ACCATAAAAG AGCTGGAATC GATTATAAGC AAAAGGAATA TTCAATTTTC CTGCAAGAAT 900

ATCTACAGAC CTGACAAGTT TCTTCAGTGT GTGAAAAATC CTGAGGATTC ATCTTGCACA 960 

TCTGAGATCT GAGCCAGTCG CTGTGGTTGT TTTAGCTCCT TGACTCCTTG TGGTTTATGT 1020

CATCATACAT GACTCAGCAT ACCTGCTGGT GCAGAGCTGA AGATTTTGGA GGGTCCTCCA 1080 

CAATAAGGTC AATGCCAGAG ACGGAAGCCT TTTTCCCCAA AGTCTTAAAA TAACTTATAT 1140 

CATCAGCATA CCTTTATTGT GATCTATCAA TAGTCAAGAA AAATTATTGT ATAAGATTAG 1200 
AATGAAAATT GTATGTTAAG TTACTTCCTT TAG 



SEQ ID NO:36 PBC1 Protein sequence
Protein Accession #: NP_001766



1 11 21 31 41 51 

I I I I I I 

MANCEFSPVS GDKPCCRLSR RAQLCLGVSI LVLILVVVLA VVVPRWRQTW SGPGTTKRFP 60

ETVLARCVKY TEIHPEMRHV DCQSVWDAFK GAFISKHPCN ITEEDYQPLM KLGTQTVPCN 120

KILLWSRIKD LAHQFTQVQR DMFTLEDTLL GYLADDLTWC GEFNTSKINY QSCPDWRKDC 180

SNNPVSVFWK TVSRRFAEAA CDVVHVMLNG SRSKIFDKNS TFGSVEVHNL QPEKVQTLEA 240
WVIHGGREDS RDLCQDPTIK ELESIISKRN IQFSCKNIYR PDKFLQCVKN PEDSSCTSEI



SEQ ID NO:37 PBH1 DNA SEQUENCE

Nucleic Acid Accession #: XM_017718

Coding sequence: 1-3315 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

ATGTCCTTTC GGGCAGCCAG GCTCAGCATG AGGAACAGAA GGAATGACAC TCTGGACAGC 60

ACCCGGACCC TGTACTCCAG CGCGTCTCGG AGCACAGACT TGTCTTACAG TGAAAGCGAC 120

TTGGTGAATT TTATTCAAGC AAATTTTAAG AAACGAGAAT GTGTCTTCTT TACCAAAGAT 180 

TCCAAGGCCA CGGAGAATGT GTGCAAGTGT GGCTATGCCC AGAGCCAGCA CATGGAAGGC 240 

ACCCAGATCA ACCAAAGTGA GAAATGGAAC TACAAGAAAC ACACCAAGGA ATTTCCTACC 300 

GACGCCTTTG GGGATATTCA GTTTGAGACA CTGGGGAAGA AAGGGAAGTA TATACGTCTG 360 

TCCTGCGACA CGGACGCGGA AATCCTTTAC GAGCTGCTGA CCCAGCACTG GCACCTGAAA 420

ACACCCAACC TGGTCATTTC TGTGACCGGG GGCGCCAAGA ACTTCGCCCT GAAGCCGCGC 480 

ATGCGCAAGA TCTTCAGCCG GCTCATCTAC ATCGCGCAGT CCAAAGGTGC TTGGATTCTC 540 

ACGGGAGGCA CCCATTATGG CCTGATGAAG TACATCGGGG AGGTGGTGAG AGATAACACC 600 

ATCAGCAGGA GTTCAGAGGA GAATATTGTG GCCATTGGCA TAGCAGCTTG GGGCATGGTC 660 

TCCAACCGGG ACACCCTCAT CAGGAATTGC GATGCTGAGG GCTATTTTTT AGCCCAGTAC 720 

CTTATGGATG ACTTCACAAG AGATCCACTG TATATCCTGG ACAACAACCA CACACATTTG 780 

CTGCTCGTGG ACAATGGCTG TCATGGACAT CCCACTGTCG AAGCAAAGCT CCGGAATCAG 840 

CTAGAGAAGT ATATCTCTGA GCGCACTATT CAAGATTCCA ACTATGGTGG CAAGATCCCC 900 

ATTGTGTGTT TTGCCCAAGG AGGTGGAAAA GAGACTTTGA AAGCCATCAA TACCTCCATC 960 

AAAAATAAAA TTCCTTGTGT GGTGGTGGAA GGCTCGGGCC AGATCGCTGA TGTGATCGCT 1020 

AGCCTGGTGG AGGTGGAGGA TGCCCTGACA TCTTCTGCCG TCAAGGAGAA GCTGGTGCGC 1080 

TTTTTACCCC GCACGGTGTC CCGGCTGCCT GAGGAGGAGA CTGAGAGTTG GATCAAATGG 1140 

CTCAAAGAAA TTCTCGAATG TTCTCACCTA TTAACAGTTA TTAAAATGGA AGAAGCTGGG 1200

GATGAAATTG TGAGCAATGC CATCTCCTAC GCTCTATACA AAGCCTTCAG CACCAGTGAG 1260 

CAAGACAAGG ATAACTGGAA TGGGCAGCTG AAGCTTCTGC TGGAGTGGAA CCAGCTGGAC 1320 

TTAGCCAATG ATGAGATTTT CACCAATGAC OGOCGATGGG AGTCTGCTGA CCTTCAAGAA 1380 

GTCATGTTTA CGGCTCTCAT AAAGGACAGA CCCAAGTTTG TCCGCCTCTT TCTGGAGAAT 1440 

GGCTTGAACC TACGGAAGTT TCTCACCCAT GATGTCCTCA CTGAACTCTT CTCCAACCAC 1500 

TTCAGCACGC TTGTGTACCG GAATCTGCAG ATCGCCAAGA ATTCCTATAA TGATGCCCTC 1560 
CTCACGTTTG TCTGGAAACT GGTTGCGAAC TTCCGAAGAG GCTTCCGGAA GGAAGACAGA 1620

AATGGCCGGG ACGAGATGGA CATAGAACTC CACGACGTGT CTCCTATTAC TCGGCACCCC 1680 

CTGCAAGCTC TCTTCATCTG GGCCATTCTT CAGAATAAGA AGGAACTCTC CAAAGTCATT 1740 

TGGGAGCAGA CCAGGGGCTG CACTCTGGCA GCCCTGGGAG CCAGCAAGCT TCTGAAGACT 1800

CTGGCCAAAG TGAAGAACGA CATCAATGCT GCTGGGGAGT CCGAGGAGCT GGCTAATGAG 1860

TACGAGACCC GGGCTGTTGA GCTGTTCACT GAGTGTTACA GCAGCGATGA AGACTTGGCA 1920 

GAACAGCTGC TGGTCTATTC CTGTGAAGCT TGGGGTGGAA GCAACTGTCT GGAGCTGGCG 1980 

GTGGAGGCCA CAGACCAGCA TTTCATCGCC CAGCCTGGGG TCCAGAATTT TCTTTCTAAG 2040 

CAATGGTATG GAGAGATTTC CCGAGACACC AAGAACTGGA AGATTATCCT GTGTCTGTTT 2100

ATTATACCCT TGGTGGGCTG TGGCTTTGTA TCATTTAGGA AGAAACCTGT CGACAAGCAC 2160

AAGAAGCTGC TTTGGTACTA TGTGGCGTTC TTCACCTCCC CCTTCGTGGT CTTCTCCTGG 2220

AATGTGGTCT TCTACATCGC CTTCCTCCTG CTGTTTGCCT ACGTGCTGCT CATGGATTTC 2280 

CATTCGGTGC CACACCCCCC CGAGCTGGTC CTOTACTCGC TGGTCTTTGT CCTCTTCTGT 2340 

GATGAAGTGA GACAGTGGTA CGTAAATGGG GTGAATTATT TTACTGACCT GTGGAATGTG 2400

ATGGACACGC TOGGGCTTTT TTACTTCATA GCAGGAATTG TATTTCGGCT CCACTCTTCT 2460

AATAAAAGCT CTTTGTATTC TGGACGAGTC ATTTTCTGTC TGGACTACAT TATTTTCACT 2520

CTAAGATTGA TCCACATTTT TACTGTAAGC AGAAACTTAG GACCCAAGAT TATAATGCTG 2580 

CAGAGGATGC TGATCGATGT GTTCTTCTTC CTGTTCCTCT TTGCGGTGTG GATGGTGGCC 2640 

TTTGGCGTGG CCAGGCAAGG GATCCTTAGG CAGAATGAGC AGCGCTGGAG GTGGATATTC 2700 

CGTTCGGTCA TCTACGAGCC CTACCTGGCC ATGTTCGGCC AGGTGCCCAG TGACGTGGAT 2760 

GGTACCACGT ATGACTTTGC CCACTGCACC TTCACTGGGA ATGAGTCCAA GCCACTGTGT 2820 

GTGGAGCTGG ATGAGCACAA CCTGCCCCGG TTCCCCGAGT GGATCACCAT CCCCCTGGTG 2880

TGCATCTACA TGTTATCCAC CAACATCCTO CTGGTCAACC TGCTGGTCGC CATGTTTGGC 2940 

TACAGGGTGG GCACCGTCCA GGAGAACAAT GACCAGGTCT GGAAGTTCCA GAGGTACTTC 3000 

CTGGTGCAGG AGTACTGCAG CCGCCTCAAT ATCCCCTTCC CCTTCATCGT CTTCGCTTAC 3060

TTCTACATGG TGGTGAAGAA GTGCTTCAAG TGTTCCTGCA AGGAGAAAAA CATGGAGTCT 3120 

TCTGTCTGCT GTTTCAAAAA TGAAGACAAT GAGACTCTGG CATGGGAGGG TGTCATGAAG 3180

GAAAACTACC TTGTCAAGAT CAACACAAAA GCCAACGACA CCTCAGAGGA AATGAGGCAT 3240 

CGATTTAGAC AACTGGATAC AAAGCTTAAT GATCTCAAGG GTCTTCTGAA AGAGATTGCT 3300 
AATAAAATCA AATGA 
SEQ ID NO:38 PBH1 Protein sequence
Protein Accession #: XP_017718

1 11 21 31 41 51

I I I I I I

MSFRAARLSM RNRRNDTLDS TRTLYSSASR STDLSYSESD LVNFIQANFK KRECVFFTKD 60

SKATENVCKC GYAQSQHMEG TQINQSEKWN YKKHTKEFPT DAPGDIQFET LGKKGKYIRL 120 

SCDTDAEILY ELLTQHWHLK TPNLVISVTG GAKNFALKPR MRKIFSRLIY IAQSKGAWIL 180 

TGGTHYGLMK YIGEWRDNT ISRSSEENIV AIGIAAWGMV SNRDTLIRNC DAEGYFLAQY 240 

tMDDFTODPL YILDNNHTHL LLVDNGCHGH PTVEAKLRNQ LEKYISERTI QOSNYGGKIP 300 

IVCFAQGGGK ETLKAHJTSI KNKIPCWVE GSGQIAWIA SLVEVEDALT SSAVKBKLVR 360 

FLPRTVSRLP EEETESWIKW LKEILECSHL LTVXKHEBAG DEIVSNAISY ALYKAFSTSE 420 

QDKDNWNGQL KLLLEWNQLD LANDEIFTND RRWESADLQE VMPTALIKDR PKPVRLFLEN 480 

GLNLRKFLTH DVLTELFSNH FSTLVYRNLQ IAKNSYNDAL LTFVWKLVAN FRRGFRKEDR 540 

KGRDEMDIEL HDVSPITRHP LQALFIWAIL QNKKELSKVI WEQTRGCTLA ALGASKLLKT 600 

XAKVKNDZNA AGESEELANE YETRAVELFT ECYSSDEDLA' EQLLVYSCEA WGGSHCLBLA 660 

VEATDQHFIA QPGVQNFLSK QWYGEISRDT KMWKIILCLP IIPLVGCGFV SPRKKPVDKH 720 

KKLLWYYVAF PTSPFWFSW NWFYIAFLL LFAYVLLMDF HSVPHPPELV LYSLVFVLFC 780 

PEVRQWYVNG VNYFTDLWNV MDTLGLFYFI AGIVPRLHSS NKSSLYSGRV IFCLDYIIFT 840 

LRLIHTFTVS RNLGFKZZHL QRMLIDVFFF LFLFAVWMVA FGVARQGILR QNEQRWRWIF 900 

RSVIYEPYLA MFGQVPSDVD GTTYCFAHCT FTGNESKPLC VELDEHNLPR FPEWITIPLV 960 

CTYMLSTNIL LVNLLVAMFG YTVGTVQENN DQVWKFQRYF LVQBYCSRUI IPFPFIVFAY 1020 

PYHWKKCFK CCCKEKNMES SVCCFKNEDN ETLAMEGVHK EWYLVKINTK ANDTSEEMRH 1080 
RFRQLDTKLN DLKGLLKEIA NKIK

SEQ ID NO:39 PBH3 DNA SEQUENCE

Nucleic Acid Accession #: XM_011804

Coding sequence: 1-558 (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51 

I I I I I I 

ATGCCTCGCC TGTTCTTGTT CCACCTGCTA GAATTCTGTT TACTACTGAA CCAATTTTCC 60 

AGAGCAGTCG CGGCCAAATG GAAGGACGAT GTTATTAAAT TATGCGGCCG CGAATTAGTT 120

CGCGCGCAGA TTGCCATTTG CGGCATGAGC ACCTGGAGCA AAAGGTCTCT GAGCCAGGAA 180

GATGCTCCTC AGACACCTAG ACCAGTGGCA GAAATTGTAC CATCCTTCAT CAACAAAGAT 240 

ACAGAAACTA TAATTATCAT GTTGGAATTC ATTGCTAATT TGCCACCGGA GCTGAAGGCA 300 

GCCCTATCTG AGAGGCAACC ATCATTACCA GAGCTACAGC AGTATGTACC TGCATTAAAG 360 

GATTCCAATC TTAGCTTTGA AGAATTTAAG AAACTTATTC GCAATAGGCA AAGTGAAGCC 420

GCAGACAGCA ATCCTTCAGA ATTAAAATAC TTAGGCTTGG ATACTCATTC TCAAAAAAAG 480

AGACGACCCT ACGTGGCACT GTTTGAGAAA TGTTGCCTAA TTGGTTGTAC CAAAAGGTCT 540
CTTGCTAAAT ATTGCTGA

SEQ ID NO:40 PBH3 PROTEIN SEQUENCE

Protein Accession #: XP_008842

1 11 21 31 41 51 

I I I I I I 

MPRLFLFHLL EFCLLLNQFS RAVAAKWKDD VIKLCGRELV RAQIAICGMS TWSKRSLSQE 60

DAPQTPRPVA EIVPSFINKD TETIIIMLEF IANLPPELKA ALSERQPSLP ELQQYVPALK 120

DSNLSFEEFK KLIRNRQSEA ADSNPSELKY LGLDTHSQKK RRPYVALFEK CCLIGCTKRS 180 

LAKYC 

SEQ ID NO:41 PBH5 DNA SEQUENCE

Nucleic Acid Accession #: NM_005845

Coding sequence: 1-3978 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I 

ATGCTGCCCG TGTACCAGGA GGTGAAGCCC AACCCGCTGC AGGACGCGAA CCTCTGCTCA 60 

CGCGTGTTCT TCTGGTGGCT CAATCCCTTG TTTAAAATTG GCCATAAACG GAGATTAGAG 120

GAAGATGATA TGTATTCAGT GCTGCCAGAA GACCGCTCAC AGCACCTTGG AGAGGAGTTG 180

CAAGGGTTGT GGGATAAAGA AGTTTTAAGA GCTGAGAATG ACGCACAGAA GCCTTCTTTA 240 

ACAAGAGCAA TCATAAAGTG TTACTGGAAA TCTTATTTAG TTTTGGGAAT TTTTACGTTA 300 

ATTGAGGAAA GTGCCAAAGT AATCCAGCCC ATATTTTTGG GAAAAATTAT TAATTATTTT 360 

GAAAATTATG ATCCCATGGA TTCTGTGGCT TTGAACACAG CGTACGCCTA TGCCACGGTG 420 

CTGACTTTTT GCACGCTCAT TTTGGCTATA CTGCATCACT TATATTTTTA TCACGTTCAG 480 

TGTGCTGGGA TGAGGTTACG AGTAGCCATG TGCCATATGA TTTATCGGAA GGCACTTCGT 540

CTTAGTAACA TGGCCATGGG GAAGACAACC ACAGGCCAGA TAGTCAATCT GCTGTCCAAT 600 

GATGTGAACA AGTTTGATCA GGTGACAGTG TTCTTACACT TCCTGTGGGC AGGACCACTG 660 

CAGGCGATCG CAGTGACTGC CCTACTCTGG ATGGAGATAG GAATATCGTG CCTTGCTGGG 720 

ATGGCAGTTC TAATCATTCT CCTGCCCTTG CAAAGCTGTT TTGGGAAGTT GTTCTCATCA 780

CTGAGGAGTA AAACTGCAAC TTTCACGGAT GCCAGGATCA GGACCATGAA TGAAGTTATA 840 

ACTGGTATAA GGATAATAAA AATGTACGCC TGGGAAAAGT CATTTTCAAA TCTTATTACC 900 

AATTTGAGAA AGAAGGAGAT TTCCAAGATT CTGAGAAGTT CCTGCCTCAG GGGGATGAAT 960 

TTGGCTTCGT TTTTCAGTGC AAGCAAAATC ATCGTGTTTG TGACCTTCAC CACCTACGTG 1020 

CTCCTCGGCA GTGTGATCAC AGCCAGCCGC GTGTTCGTGG CAGTGACGCT GTATGGGGCT 1080

GTGCGGCTGA CGGTTACCCT CTTCTTCCCC TCAGCCATTG AGAGGGTGTC AGAGGCAATC 1140

GTCAGCATCC GAAGAATCCA GACCTTTTTG CTACTTGATG AGATATCACA GCGCAACCGT 1200 

CAGCTGCCGT CAGATGGTAA AAAGATGGTG CATGTGCAGG ATTTTACTGC TTTTTGGGAT 1260

AAGGCATCAG AGACCCCAAC TCTACAAGGC CTTTCCTTTA CTGTCAGACC TGGCGAATTG 1320 

TTAGCTGTGG TCGGCCCCGT GGGAGCAGGG AAGTCATCAC TGTTAAGTGC CGTGCTCGGG 1380 

GAATTGGCCC CAAGTCACGG GCTGGTCAGC GTGCATGGAA GAATTGCCTA TGTGTCTCAG 1440 

CAGCCCTGGG TGTTCTCGGG AACTCTGAGG AGTAATATTT TATTTGGGAA GAAATACGAA 1500 

AAGGAACGAT ATGAAAAAGT CATAAAGGCT TGTGCTCTGA AAAAGGATTT ACAGCTGTTG 1560 

GAGGATGGTG ATCTGACTGT GATAGGAGAT CGGGGAACCA CGCTGAGTGG AGGGCAGAAA 1620 

GCACGGGTAA ACCTTGCAAG AGCAGTGTAT CAAGATGCTG ACATCTATCT CCTGGACGAT 1680 

CCTCTCAGTG CAGTAGATGC GGAAGTTAGC AGACACTTGT TCGAACTGTG TATTTGTCAA 1740 

ATTTTGCATG AGAAGATCAC AATTTTAGTG ACTCATCAGT TGCAGTACCT CAAAGCTGCA 1800 

AGTCAGATTC TGATATTGAA AGATGGTAAA ATGGTGCAGA AGGGGACTTA CACTGAGTTC 1860 

CTAAAATCTG GTATAGATTT TGGCTCCCTT TTAAAGAAGG ATAATGAGGA AAGTGAACAA 1920 

CCTCCAGTTC CAGGAACTCC CACACTAAGG AATCGTACCT TCTCAGAGTC TTCGGTTTGG 1980 

TCTCAACAAT CTTCTAGACC CTCCTTGAAA GATGGTGCTC TGGAGAGCCA AGATACAGAG 2040 

AATGTCCCAG TTACACTATC AGAGGAGAAC CGTTCTGAAG GAAAAGTTGG TTTTCAGGCC 2100 

TATAAGAATT ACTTCAGAGC TGGTGCTCAC TGGATTGTCT TCATTTTCCT TATTCTCCTA 2160 

AACACTGCAG CTCAGGTTGC CTATGTGCTT CAAGATTGGT GGCTTTCATA CTGGGCAAAC 2220 

AAACAAAGTA TGCTAAATGT CACTGTAAAT GGAGGAGGAA ATCTAACCGA GAAGCTAGAT 2280 

CTTAACTGGT ACTTAGGAAT TTATTCAGGT TTAACTGTAG CTACCGTTCT TTTTGGCATA 2340 

GCAAGATCTC TATTGGTATT CTACGTCCTT GTTAACTCTT CACAAACTTT GCACAACAAA 2400 

ATGTTTGAGT CAATTCTGAA AGCTCCGGTA TTATTCTTTG ATAGAAATCC AATAGGAAGA 2460 

ATTTTAAATC GTTTCTCCAA AGACATTGGA CACTTGGATG ATTTGCTGCC GCTGACGTTT 2520 

TTAGATTTCA TCCAGACATT GCTACAAGTG GTTGGTGTGG TCTCTGTGGC TGTGGCCGTG 2580 

ATTCCTTGGA TCGCAATACC CTTGGTTCCC CTTGGAATCA TTTTCATTTT TCTTCGGCGA 2640 

TATTTTTTGG AAACGTCAAG AGATGTGAAG CGCCTGGAAT CTACAACTCG GAGTCCAGTG 2700 

TTTTCCCACT TGTCATCTTC TCTCCAGGGG CTCTGGACCA TCCGGGCATA CAAAGCAGAA 2760 

GAGAGGTGTC AGGAACTGTT TGATGCACAC CAGGATTTAC ATTCAGAGGC TTGGTTCTTG 2820

TTTTTGACAA CGTCCCGCTG GTTCGCCGTC CGTCTGGATG CCATCTGTGC CATGTTTGTC 2880

ATCATCGTTG CCTTTGGGTC CCTGATTCTG GCAAAAACTC TGGATGCCGG GCAGGTTGGT 2940 

TTGGCACTGT CCTATGCCCT CACGCTCATG GGGATGTTTC AGTGGTGTGT TCGACAAAGT 3000 

GCTGAAGTTG AGAATATGAT GATCTCAGTA GAAAGGGTCA TTGAATACAC AGACCTTGAA 3060 

AAAGAAGCAC CTTGGGAATA TCAGAAACGC CCACCACCAG CCTGGCCCCA TGAAGGAGTG 3120 

ATAATCTTTG ACAATGTGAA CTTCATGTAC AGTCCAGGTG GGCCTCTGGT ACTGAAGCAT 3180 

CTGACAGCAC TCATTAAATC ACAAGAAAAG GTTGGCATTG TGGGAAGAAC CGGAGCTGGA 3240 

AAAAGTTCCC TCATCTCAGC CCTTTTTAGA TTGTCAGAAC CCGAAGGTAA AATTTGGATT 3300 

GATAAGATCT TGACAACTGA AATTGGACTT CACGATTTAA GGAAGAAAAT GTCAATCATA 3360 

CCTCAGGAAC CTGTTTTGTT CACTGGAACA ATGAGGAAAA ACCTGGATCC CTTTAATGAG 3420 

CACACGGATG AGGAACTGTG GAATGCCTTA CAAGAGGTAC AACTTAAAGA AACCATTGAA 3480 

GATCTTCCTG GTAAAATGGA TACTGAATTA GCAGAATCAG GATCCAATTT TAGTGTTGGA 3540

CAAAGACAAC TGGTGTGCCT TGCCAGGGCA ATTCTCAGGA AAAATCAGAT ATTGATTATT 3600 

GATGAAGCGA CGGCAAATGT GGATCCAAGA ACTGATGAGT TAATACAAAA AAAAATCCGG 3660 

GAGAAATTTG CCCACTGCAC CGTGCTAACC ATTGCACACA GATTGAACAC CATTATTGAC 3720 

AGCGACAAGA TAATGGTTTT AGATTCAGGA AGACTGAAAG AATATGATGA GCCGTATGTT 3780 

TTGCTGCAAA ATAAAGAGAG CCTATTTTAC AAGATGGTGC AACAACTGGG CAAGGCAGAA 3840 

GCCGCTGCCC TCACTGAAAC AGCAAAACAG GTATACTTCA AAAGAAATTA TCCACATATT 3900 

GGTCACACTG ACCACATGGT TACAAACACT TCCAATGGAC AGCCCTCGAC CTTAACTATT 3960 
TTCGAGACAG CACTGTGA 
SEQ ID NO:42 PBH5 PROTEIN SEQUENCE

Protein Accession #: NP_005836

1 11 21 31 41 51

I I I I I I

MLPVYQEVKP NPLQDANLCS RVFFWWLNPL FKIGHKRRLE EDDMYSVLPE DRSQHLGEEL 60

QGFWDKEVLR AENDAQKPSL TRAIIKCYWK SYLVLGIFTL IEESAKVIQP IFLGKIINYF 120 

ENYDPMDSVA LNTAYAYATV LTFCTLILAI LHHLYFYHVQ CAGMRLRVAM CHMIYRKALR 180

LSNMAMGKTT TGQIVNLLSN DVNKFDQVTV FLHFLWAGPL QAIAVTALLW MEIGISCLAG 240

MAVLIILLPL QSCFGKLFSS LRSKTATFTD ARIRTMNEVI TGIRIIKMYA WEKSFSNLIT 300

NLRKKEISKI LRSSCLRGMN LASFFSASKI IVFVTFTTYV LLGSVITASR VFVAVTLYGA 360 

VRLTVTLFFP SAIERVSEAI VSIRRIQTFL LLDEISQRNR QLPSDGKKMV HVQDFTAFWD 420

KASETPTLQG LSFTVRPGEL LAVVGPVGAG KSSLLSAVLG ELAPSHGLVS VHGRIAYVSQ 480

QPWVFSGTLR SNILFGKKYE KERYEKVIKA CALKKDLQLL EDGDLTVIGD RGTTLSGGQK 540

ARVNLARAVY QDADIYLLDD PLSAVDAEVS RHLFELCICQ ILHEKITILV THQLQYLKAA 600 

SQILILKDGK MVQKGTYTEF LKSGIDFGSL LKKDNEESEQ PPVPGTPTLR NRTFSESSVW 660

SQQSSRPSLK DGALESQDTE NVPVTLSEEN RSEGKVGFQA YKNYFRAGAH WIVFIFLILL 720

NTAAQVAYVL QDWWLSYWAN KQSMLNVTVN GGGNVTEKLD LNWYLGIYSG LTVATVLFGI 780

ARSLLVFYVL VNSSQTLHNK MFESILKAPV LFFDRNPIGR ILNRFSKDIG HLDDLLPLTF 840

LDFIQTLLQV VGVVSVAVAV IPWIAIPLVP LGIIFIFLRR YFLETSRDVK RLESTTRSPV 900

FSHLSSSLQG LWTIRAYKAE ERCQELFDAH QDLHSEAWFL FLTTSRWFAV RLDAICAMFV 960 

IIVAFGSLIL AKTLDAGQVG LALSYALTLM GMFQWCVRQS AEVENMMISV ERVIEYTDLE 1020

KEAPWEYQKR PPPAWPHEGV IIFDNVNFMY SPGGPLVLKH LTALIKSQEK VGIVGRTGAG 1080

KSSLISALFR LSEPEGKIWI DKILTTEIGL HDLRKKMSII PQEPVLFTGT MRKNLDPFNE 1140

HTDEELWNAL QEVQLKETIE DLPGKMDTEL AESGSNFSVG QRQLVCLARA ILRKNQILII 1200

DEATANVDPR TDELIQKKIR EKFAHCTVLT IAHRLNTIID SDKIMVLDSG RLKEYDEPYV 1260

LLQNKESLFY KMVQQLGKAE AAALTETAKQ VYFKRNYPHI GHTDHMVTNT SNGQPSTLTI 1320
FETAL 

SEQ ID NO:43 PBQ7 DNA SEQUENCE
Nucleic Acid Accession #: NM_021233
Coding sequence: 34-1119 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

ATGGGGAAAG TGTCCTGCTG TGGCATGAAA TAAATGAAAC AGAAAATGAT GGCAAGACTG 60 

CTAAGAACAT CCTTTGCTTT GCTCTTCCTT GGCCTCTTTG GGGTGCTGGG GGCAGCAACA 120

ATTTCATGCA GAAATGAAGA AGGGAAAGCT GTGGACTGGT TTACTTTTTA TAAGTTACCT 180 

AAAAGACAAA ACAAGGAAAG TGGAGAGACT GGGTTAGAGT ACCTGTACCT AGACTCTACA 240 

ACTAGAAGCT GGAGGAAGAG TGAGCAACTA ATGAATGACA CCAAGAGTGT TTTGGGAAGG 300 

ACATTACAAC AGCTATATGA AGCATATGCC TCTAAGAGTA ACAACACAGC CTATCTAATA 360 

TACAATGATG GAGTCCCTAA ACCTGTGAAT TACAGTAGAA AGTATGGACA CACCAAAGGT 420 

TTACTGCTGT GGAACAGAGT TCAAGGGTTC TCGCTGATTC ATTCCATCCC TCAGTTTCCT 480 

CCAATTCCGG AAGAAGGCTA TGATTATCCA CCCACAGGGA GACGAAATGG ACAAAGTGGC 540 

ATCTGCATAA CTTTCAAGTA CAACCAGTAT GAGGCAATAG ATTCTCAGCT CTTGGTCTGC 600 

AACCCCAACG TCTATAGCTG CTCCATCCCA GCCACCTTTC ACCAGGAGCT CATTCACATG 660 

CCCCAGCTGT GCACCAGGGC CAGCTCATCA GAGATTCCTG GCAGGCTCCT CACCACACTT 720 

CAGTCGGCCC AGGGACAAAA ATTCCTCCAT TTTGCAAAGT CGGATTCTTT TCTTGACGAC 780 

ATCTTTGCAG CCTGGATGGC TCAACGGCTG AAGACACACT TGTTAACAGA AACCTGGCAG 840 

CGAAAAAGAC AAGAGCTTCC TTCAAACTGC TCCCTTCCTT ACCATGTCTA CAATATAAAA 900

GCAATTAAAT TATCACGACA CTCTTATTTC AGTTCTTATC AAGATCACGC CAAGTGGTGT 960 

ATTTCCCAAA AGGGCACCAA AAATCGCTGG ACATGTATTG GAGACCTAAA TCGGAGTCCA 1020 

CACCAAGCCT TCAGAAGTGG AGGATTCATT TGTACCCAGA ATTGGCAAAT TTACCAAGCA 1080 
TTTCAAGGAT TAGTATTATA CTATGAAAGC TGTAAGTAAA CTTGGTGAAA GGACACAGGT 

SEQ ID NO:44 PBQ7 Protein sequence
Protein Accession #: NP_067056

1 11 21 31 41 51 

I I I I I I 

MMARLLRTSF ALLFLGLFGV LGAATISCRN EEGKAVDWFT FYKLPKRQNK ESGETGLEYL 60 

YLDSTTRSWR KSEQLMNDTK SVLGRTLQQL YEAYASKSNN TAYLIYNDGV PKPVNYSRKY 120

GHTKGLLLWN RVQGFWLIHS IPQFPPIPEE GYDYPPTGRR NGQSGICITF KYNQYEAIDS 180 

QLLVCNPNVY SCSIPATFHQ ELIHMPQLCT RASSSEIPGR LLTTLQSAQG QKFLHFAKSD 240

SFLDDIFAAW MAQRLKTHLL TETWQRKRQE LPSNCSLPYH VYNIKAIKLS RHSYFSSYQD 300
HAKWCISQKG TKNRWTCIGD LNRSPHQAFR SGGFICTQNW QIYQAFQGLV LYYESCK

SEQ ID NO:45 PCQ8 DNA SEQUENCE

Nucleic Acid Accession #: XMJB0453

Coding sequence: 89-1273 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

CGGTGCCCTG GGGTGGAATA TCCCCTACGA ATTTAACCAA GCGGACTTTA ATGCCACTGT 60 

GCAGTTCATC CAAAACCACT TGGATGACAT GGATGTCAAA AAGGGTGTCT CCTGGACCAC 120 

CATCCGCTAC ATGATAGGAG AGATTCAATA TGGAGGCAGA GTCACTCACG ACTATGATAA 180 

GAGATTGTTG AACACATTTG CTAAGGTTTG GTTCAGTGAA AATATGTTTG GACCAGATTT 240 

CAGTTTTTAC CAAGGATACA ATATTCCAAA ATGCAGCACA GTGGATAACT ATCTTCAGTA 300 

TATCCAGAGT TTGCCTGCCT ATGACAGCCC TGAGGTGTTT GGGCTGCACC CCAATGCTGA 360 
CATCACCTAC CAGAGCAAGC TGGCCAAGGA CGTGCTGGAC ACCATCCTAG GCATCCAACC 420

CAAGGACACC TCTGGTGGAG GGGATGAGAC CCGGGAGGCG GTGGTGGCCC GGCTGGCTGA 480

TGATATGCTG GAGAAGCTGC CCCCAGACTA TGTCCCCTTT GAAGTAAAAG AGAGGCTCCA 540 

GAAGATGGGG CCATTCCAGC CTATGAACAT TTTCCTCAGG CAGGAAATAG ACAGAATGCA 600 

AAGGGTACTC AGCCTTGTCC GCAGCACCCT CACTGAGCTG AAACTTGCTA TTGATGGCAC 660 

CATCATCATG AGCGAAAATC TGCAAGATGC ATTGGATTGC ATGTTTGATG CTAGAATCCC 720 

TGCTTGGTGG AAAAAAGCTT CTTGGGTTTT TAGTACACTG GGTTTCTGGT TTACTGAACT 780 

TATAGAAAGA AACAGCCAGT TTACCTCGTG GGTTTTCAAT GGCCGACCTC ACTGCTTTTG 840 

GATGACGGGT TTTTTTAACC CCCAGGGATT TTTAACTGCA ATGCGACAGG AAATAACTCG 900 

GGCCAACAAA GGCTGGGCTC TGGACAATAT GGTGCTTTGC AATGAAGTCA CCAAATGGAT 960

GAAGGACGAC ATTTCTACCC CTCCCACAGA GGGTGTCTAT GTCTATGGCT TATATCTTGA 1020 

AGGTGCTGGC TGGGACAAGA GGAACATGAA ACTCATTGAA TCAAAGCCAA AAGTGCTCTT 1080 

TGAGTTGATG CCTGTCATAA GGATTTATGC AGAAAACAAT ACTTTACGAG ATCCTCGGTT 1140 

TTACTCCTGT CCCATCTATA AGAAGCCAGT TCGAACGGAC TTGAACTACA TTGCCGCTGT 1200

GGATCTCAGG ACAGCCCAGA CCCCTGAACA CTGGGTCCTC CGTGGGGTTG CCCTTCTGTG 1260

TGATGTCAAG TAACATGTGG GGAGTGTCCC CACCCAATGC TTTGGAAAAT GCAAGATCTA 1320 

AATTATTGTA ACCTTTATTT CTGTATGACT GCTGGACAGT GTATGTTAGG TCGTTTATGC 1380 

AATTAATGAG CTGCATAGGT TTTCCCCACT CCTTAATTGG ATGCTTATAT TTTACTTGTT 1440 

TCATCATTAG TGACCAATGT CTGAGTTTGT TGAAAATGTT ATTTAGTGAT ATAAAAGTAA 1500 

ATTTACAGCA TCCTAATGAA GTGTGGCCCT CAAATCCACA GTAGTATATT TTCTTCTTAC 1560 

TTCGCTCCGA AGACTGACTG TGATTATAAC AGCAAATATA TTTGCATGTG GACAAAGATT 1620 

AGATGGCAAG ATAGAAAAAT AAGAACAGAT GTGATAGCAA GAATTATAGT TGGCTTGAAA 1680 

AAATGTGATG ATCAGGAGAA AAAATAAAAA AAGGGTAGAA ATATTAGACG GTGCGTAGGG 1740 

ACTTTCTATG GACTTTTATT AATTAGGAAA CATTATCAAA GGAACTTTTC ACGTATTTTT 1800 

CTTTAAATTC TGGTTAGATG TTATTAATAA TTCTTCATCT AACCTACTGA CTAGAAAATA 1860 

TAGTCAGTAC TAAATTAGAA TTGTGGTTTA TAAACTTTTG GTTAGCTCTG GATCTGTATA 1920 

ACTGCATTTT TTTGGATAAA CAGTTTTTGG TAGGTGGATA CCGGGAGACA AGTGTGGGTC 1980 

CCTCTCACTO GGCTTCATTC TGTGGACCAG GATCATTATT TCATGCTCAT GATCATGAGA 2040 

GTTAGGACTG AGTGGCTCCT GTGACTCCCA CCATCTTAGA TGATACTGTT TTCTTGTGAG 2100 

TTCTTTCTTT TGGTGTGGAT TAGTATATCA GTTGATTTGT GTGAATTGTG GTGAAACAAT 2160 

CATTTCATTT TGAAAAGCAA GTAATGAAAA TGTCAGCATC ATAGGAATTA ATAAAATGTT 2220 
TTTACTAAAA AAAAAAAAAA AAA 

SEQ ID NO:46 PCQ8 Protein sequence
Protein Accession #: BAB 15543 

1 11 21 31 41 51 

I I I I I I 

MDVKKGVSWT TIRYMIGEIQ YGGRVTDDYD KRLLNTFAKV WFSENMFGPD FSFYQGYNIP 60

KCSTVDNYLQ YIQSLPAYDS PEVFGLHPNA DITYQSKLAK DVLDTILGIQ PKDTSGGGDE 120 

TREAVVARLA DDMLEKLPPD YVPFEVKERL QKMGPFQPMN IFLRQEIDRM QRVLSLVRST 180

LTELKLAIDG TIIMSENLQD ALDCMFDARI PAWWKKASWV FSTLGFWFTE LIERNSQFTS 240 

WVFNGRPHCF WMTGFFNPQG FLTAMRQEIT RANKGWALDN MVLCNEVTKW MKDDISTPPT 300

EGVYVYGLYL EGAGWDKRNM KLIESKPKVL FELMPVIRIY AENNTLRDPR FYSCPIYKKP 360
VRTDLNYIAA VDLRTAQTPE HWVLRGVALL CDVK

SEQ ID NO:47 PDG5 DNA SEQUENCE

Nucleic Acid Accession #: AB033036

Coding sequence: 68-3349 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

GGAGCAGCCT ACAACTTCAC AACCAGAAAC CACTACCCCT CAGGGGTTGC TTTCAGATAA 60

AGATGACATG GGAAGGAGAA ATGCTGGCAT AGATTTCGGA TCCAGAAAAG CATCAGCAGC 120 

ACAGCCCATA CCTGAAAACA TGGACAATTC CATGGTTAGT GATCCACAAC CATACCATGA 180

AGATGCAGCT TCTGGAGCTG AGAAGACAGA AGCCAGAGCT TCTCTCTCAC TGATGGTGGA 240 

AAGCCTTTCT ACAACCCAAG AGGAGGCCAT TCTCTCAGTA GCAGCAGAGG CTCAGGTGTT 300 

TATGAATCCT TCTCATATCC AGTTAGAAGA TCAAGAAGCT TTCAGCTTTG ATTTACAAAA 360 

GGCCCAATCC AAAATGGAGT CAGCCCAGGA TGTTCAAACT ATCTGCAAAG AAAAGCCTTC 420 

TGGAAATGTT CACCAGACCT TTACAGCAAG TGTTTTGGGT ATGACAAGTA CTACAGCCAA 480 

AGGAGATGTT TATGCCAAGA CTCTGCCTCC CAGAAGCCTT TTTCAGTCCT CAAGGAAGCC 540 

TGATGCTGAA GAAGTCTCCT CAGATTCAGA GAATATTCCT GAGGAGGGGG ATGGTTCTGA 600 

AGAACTGGCT CATGGTCACT CTTCCCAGTC CTTGGGGAAG TTTGAAGATG AACAAGAAGT 660 

CTTCTCAGAA TCAAAAAGTT TTGTTGAGGA CTTGAGCAGC TCTGAGGAGG AGCTGGACCT 720 

CAGATGCCTC TCCCAGGCTT TAGAGGAGCC TGAAGATGCA GAAGTCTTCA CAGAATCAAG 780 

CAGTTATGTT GAAAAGTACA ACACTTCTGA TGATTGCAGC AGCTCAGAGG AAGACCTGCC 840 

TCTCAGACAC CCTGCTCAGG CCTTGGGAAA GCCCAAAAAC CAACAAGAAG TCTCCTCTGC 900 

TTCAAATAAT ACTCCTGAAG AGCAGAATGA TTTTATGCAG CAGCTGCCTT CCAGATGCCC 960 

TTCTCAGCCC ATTATGAATC CTACTGTTCA GCAACAAGTC CCCACCAGTT CAGTGGGCAC 1020 

TTCTATAAAA CAGAGCGATT CCGTGGAGCC AATCCCTCCA AGACACCCTT TCCAGCCATG 1080 

GGTGAACCCT AAAGTGGAGC AAGAAGTTTC CTCATCTCCA AAGAGCATGG CTGTTGAAGA 1140

GAGCATTTCT ATGAAGCCTC TGCCTCCTAA ACTTCTTTGC CAGCCCTTGA TGAATCCTAA 1200 

AGTTCAACAA AACATGTTCT CAGGTTCAGA GGACATTGCT GTTGAGAGAG TCATTTCTGT 1260 

GGAGCCACTA CTCCCCAGAT ATTCTCCTCA GTCCTTGACA GATCCTCAAA TCCGGCAAAT 1320 

CTCAGAAAGC ACAGCTGTTG AGGAAGGCAC TTATGTGGAA CCGCTGCCTC CCAGATGCCT 1380 

TTCCCAGCCC TCGGAGAGGC CTAAGTTCCT GGACTCAATG AGTACTTCTG CAGAATGGAG 1440 

CAGTCCTGTG GCACCAACAC CTTCCAAATA CACTTCCCCG CCATGGGTGA CCCCTAAATT 1500 

TGAGGAACTG TATCAACTCT CTGCACATCC AGAAAGCACT ACTGTTGAAG AGGACATTTC 1560 
TAAGGAGCAG CTGCTTCCCA GACATCTTTC CCAGTTGACT GTGGGAAATA AAGTCCAGCA 1620 

ACTGTCCTCA AATTTCGAGC GGGCTGCTAT TGAGGCAGAC ATTTCTGGGA GTCCATTGCC 1680 
TCCCCAATAT GCTACCCAGT TCTTAAAGAG GTCTAAAGTT CAGGAAATGA CCTCACGACT 1740

AGAGAAAATG GCTGTTGAAG GCACTTCTAA CAAATCACCG ATTCCCAGGC GTCCGACCCA 1800 

GTCATTCGTG AAATTTATGG CACAGCAAAT CTTTTCAGAG AGCTCTGCTC TTAAGAGGGG 1860 

CAGTGATGTG GCACCTCTGC CTCCCAATCT TCCTTCCAAA TCTTTATCAA AGCCTGAAGT 1920

CAAGCACCAA GTTTTCTCAG ATTCAGGGAG TGCTAATCCT AAGGGAGGCA TTTCTTCAAA 1980

GATGCTACCT ATGAAGCACC CTTTACAGTC CTTGGGGAGG CCTGAAGACC CACAGAAAGT 2040

TTTCTCTTAT TCAGAGAGAG CTCCTGGGAA GTGCAGCAGT TTTAAAGAGC AGCTGTCTCC 2100

CAGGCAGCTT TCCCAGGCCT TGAGGAAACC TGAGTATGAG CAAAAAGTCT CCCCTGTTTC 2160

TGCCAGTTCT CCTAAAGAGT GGAGGAATTC TAAAAAGCAG CTGCCTCCCA AACATTCTTC 2220

CCAAGCCTCA GATAGGTCTA AATTCCAGCC ACAGATGTCA TCAAAGGGCC CAGTGAATGT 2280 

ACCTGTAAAG CAGAGCAGCG GTGAGAAGCA CCTGCCTTCA AGTAGTCCTT TCCAGCAACA 2340 

GGTTCATTCA AGTTCTGTGA ATGCTGCTGC TAGGCGATCT GTTTTTGAGA GCAATTCTGA 2400 

CAATTGGTTC CTAGGAAGAG ATGAAGCTTT TGCAATCAAA ACCAAGAAAT TCAGCCAAGG 2460 

TTCCAAAAAC CCCATAAAGA GCATTCCAGC CCCTGCTACC AAACCTGGGA AGTTCACCAT 2520 

TGCTCCTGTC AGGCAAACAT CCACTTCTGG GGGCATTTAC TCTAAGAAAG AAGATCTTGA 2580 

GAGTGGTGAT GGTAATAATA ACCAGCATGC AAACCTATCC AATCAGGATG ATGTTGAAAA 2640 

GCTTTTTGGA GTTCGACTGA AAAGAGCCCC TCCCTCGCAG AAGTATAAGA GTGAGAAACA 2700 

AGATAACTTC ACCCAGCTTG CTTCAGTGCC CTCGGGCCCA ATTTCATCCT CTGTAGGCAG 2760 

GGGACATAAA ATCAGAAGCA CTTCCCAGGG GCTCCTGGAT GCTGCAGGGA ACCTCACCAA 2820 

AATATCTTAC GTTGCAGATA AGCAACAGAG CAGGCCCAAA TCTGAAAGCA TGGCCAAGAA 2880 

GCAACCTGCT TGCAAGACCC CAGGAAAGCC TGCTGGTCAA CAGTCAGATT ATGCTGTCTC 2940 

AGAGCCGGTT TGGATAACTA TGGCAAAGCA GAAGCAGAAG AGTTTCAAGG CCCACATTTC 3000 

TGTGAAAGAG CTGAAAACTA AGAGCAATGC TGGAGCCGAT GCTGAGACTA AGGAGCCTAA 3060 

ATATGAGGGA GCTGGCTCTG CAAATGAAAA CCAACCTAAA AAGATGTTCA CTTCCAGTGT 3120 

CCATAAACAG GAGAAGACAG CACAGATGAA GCCACCTAAG CCTACAAAAT CAGTTGGATT 3180 

TGAAGCTCAG AAGATACTGC AAGTTCCTGC CATGGAAAAA GAAACCAAAC GATCTTCAAC 3240 

TCTCCCAGCC AAGTTCCAGA ACCCAGTTGA GCCAATTGAG CCTGTCTGGT TCTCACTGGC 3300 

CAGGAAGAAA GCCAAAGCAT GGAGCCACAT GGCAGAAATC ACGCAATAAA GAGCTCTTGT 3360 

GTGGAGCATC AGCATTTATT TTATTTAGTT TTTTTTTTTT TTTTTTTTTT GAGACAGAGT 3420 

CTCGCTCTGT TACCCAGATT GGAGTGCAGT GGCGCGATCT CCGCTCACTG CAAGCTCCGC 3480 

CTCCCGGGTT CACGCCACTC TCCCGCCTCA GTCTCCCGAC TAGCTGGGAC TACAGGCGCC 3540 

CGCCATCACG CCCGGCTAAT TTTGTTTTCG TATTTTTAGT AGAGACGGGG TTTCACCATG 3600 

TTGGCCAGGA TGGTCTTGAT CTCCTGACCT CGTGATCCGC CCGCCTCAGC CTCCCAAAAG 3660

CTGGGATTAC AGGCGTGAGC CACCGCGCCC GGCCAAGCAT CAGCGTTTTA AATGATAATT 3720 

GCTAATAGCT GTATTAATTC TATGTAGTGA TCTTTTTACT GTGACCACTT GTATTAAGCA 3780 

AAATAAGTAT TAAGCAAACT AAGAATTTAT TAAGCAAAAT AAGAATTTAT TAAGCAAAAT 3840 

AGCCTTAGAA ATGCAAATTA AAACATAATT ATTTGAATGA AATAAATGCC ATGAATGCTT 3900 

AACCTTCCAC GTAGTCACTG CCAGCACCCA GAAACCCAGC ATTTCCTCTA TTAAAACTAT 3960 

CGAAAACATT TGCACTGCTG TAAAATTGCA AAATCTTTAA CTTTGGACAA TGTGCTTTAG 4020 

AAGGGAGAAA GCAAAAACAT TTTGTTGGAG CAACTAGAAA ATTGTCATTT CCCTGAACCA 4080 

AATAAAGTAA TTCTAATGGA AACATTCAGA TGATTTGACC TAAAGATTGG CCTTTAGGTT 4140 

TTATGAGCCT AGATAGATGC CGCAATTATT TGGTTGTTGC TCTAAGCTTT GCAAGGGATC 4200 

CTAAAAGAGG CGGTGGAAGT GAAAATTCTG GGTCTCCAAG AAAATTTCTG CACAGCCAGT 4260 

TCTCCAATCA GCCTATCACC CCTTGAAACA TCTTCCCTGT GTCCCTGGGG GCCCCTGATG 4320 

CTTTCTCCTT GGGTGATAGT AACATGCAGA GCACTTACAC AAAGCTCCCT CTTTGGACAT 4380 

ACCOCAOGTC GACCTGTCAC AGGCCTGGCT GTAGCGAGCA OCTCCCTATG ACGCAGAATG 4440 

CTTCTTGGGA ATTATCTTAC TCCTCTGGAG GGTTAGTCCA TCAATGTTTT GC1TCTTGTC 4500 

CCAATACTAC TGTGACCCTC TCTGATCGCA CAGAAATCAC TGCCTATCAC ATATATCCTG 4560 

TTAAGCACTG AAGACCCTAT TGAAATTAGA GTTCTACAGA TGCCAAAAGC TGTACTTTCC 4620 

ATCAGGCAGA TGGCAAGCTT ACTGCCTTGA TGCACATCTG GAGCCACTGG AGCTCCTTCC 4680 

TCTCTGGTTC CAGCATTAAG GTGGAGAACT OCATGTAGCT TCTTGTCCTT TCCCCTCAGC 4740 

TGTCTTTGCT TCACAAGGTT TTAGCCCAAA GCAAGAGTGC AATCCCAAAG CCACAGAGAA 4800 

ATGAACTTTC CGCTACCTGG AAGCTTTAAG TGAGTAAATC AGCTTTTCCC CTCTCATTCC 4860

TAGAGGCACA CACCTCAAAA GTTACTAGGC TGGAGAGACC CTACCTTCCA GTGACCCACT 4920

CATCCCCCAG CCACGGAGAA GAGGGAAGAC CAAAAAGGGA GAGTGAGAAA GAGGATGAGA 4980

GGGATGGTCA GCTGTGAGGG GAGGGGGCAA GTGGCCCAGC AAATGTTGAT GCCTCCCTTC 5040

CCATCTTGCC ACAOGGTCTT TTTCTTTTGT AGCACAGOCT CCATTAATAA CTCCTCGGCT 5100 

GAGGATGAAG ATGTAGGCAC CTTTACCCCC AGAGCCAGTT CCTTAATTGG CTGGCTTTCT 5160 

GAGATGCAGA CCACCCTAGA ATCTCATCTA GGTTCACTAG AAGTTAGTTA AATCTTCCTT 5220 

TCTCTGTCTT TCTCTTCATT CCATCCCCCA AACCCACCAA ACACTAAGGG AGAGCTCCCT 5280

TTGGATGTCT GGGCAGTAAA CCTAGCTCAT TTTTCTAGGA GACCCAGAAG TCACTTCTGA 5340 

GTAGTTATCA CTGTGTCTGC CTCTGTTACA CTGTGCTGCT TTGCTTAAAC AGAAATGCAG 5400 

GCCTGGACAT CTGACTGTGC CTTTATATTC TGAGTGGGGT GCTGCCCCAT GCAAAAAAAT 5460 

CCAGAGAGGT AGTGAGGTGT CAGAGCTAAA CACTTGGTGC TGGGTTTTGT TGATGCTGGT 5520 

ATAATGTGAC ACAGTACAAT TACATGCTAA ATTTTGCATT TTCTCTATAT AACATCTATT 5580

TTTCCTGATA CTGTGCCTTT GCCATTTTGA TAATGCTATT TTGATTGAGT GAATTTTATT 5640

TCCTTTGTAT TCCCATAGTG AACAATATAT TAAGGTAGAT GCCCTTTATC TGGGTACTCC 5700 

TGGTAGATTA GCTGTTACAC CTCCCTTCCC TTTTTTACAG TGAACCTGTA TTCAGTTATT 5760 

GTCACTCTGA GAACTCTCCA ATAACAATTT CTTTTCCACA GTTAACAACA CAGCTGTTAC 5820 

ACCTCCCTTC CTTTTTTTCA CAGTGAACCT GTATTCAGCT ATTCTCACTC TGAGAACTCT 5880 

CCAATAACAA TTTCTTTTCC ACAGTTAACA ACAAAGTTCT GTTTTTAAAT GAAGAGATTA 5940

AGTTCTTTTT AAATGCCTAA AGGCATATTC TGACAACTTT TCTACTTCTT TAACTTTTTT 6000 
GATTTAAGAT ATATGCAAAG CAAATAAATT CAATAAAGCC T 

SEQ ID NO:48 PDG5 PROTEIN SEQUENCE

1 11 21 31 41 51 

I I I I I I 

EQPTTSQPET TTPQGLLSDK DDMGRRNAGI DFGSRKASAA QPIPENMDNS MVSDPQPYHE 60
DAASGAEKTE
AQSKMBSAQD 



ARASLSLMVE 
VQTICKEKPS 
RCLSQALEEP



VNPKVEQBVS 
EPLLPRYSPQ 
SPVAPTPSKY 
LSSNFERAAI 
SFVKFMAQQI 
MLPMKHPLQS 
ASSPKEWRNS 



EDAEVFTESS 
FMQQLPSRCP 
SSPKSHAVEE 
SLTDPQIRQI 
TSPPWVTPKF 
EADISGSPLP 
FSESSALKRG 
LGRPEDPQKV 
KKQLPPKHSS 



SLSTTQEEAI 
GNVHQTPTAS 
ELAHGHSSQS 
SYVEKYNTSD 
SQPIMNPTVQ 
SISMKPLPPK 



EELYQLSAHP 
PQYATQFLKR 
SPVAPLPPNL 
FSYSERAPGK 
QASDRSKFQP 



APVRQTSTSG GTYSKKEDLE 
DNFTQLASVP SGPISSSVGR 
QPACKTPGKP AGQQSDYAVS 
YEGAGSANEN QPKKMFTSSV 
LPAKFQNFVE PIEPVWFSLA RKKAKAWSHM 



LSVAAEAQVF 
VLGMTSTTAK 
LGKFEDEQEV 
DCSSSEEDLP 
QQVPTSSVOT 
LLCQPLMNPK 
YVEPLPPRCL 
ESTTVEEDIS 
SKVQEMTSRL 
PSKSLSKPEV 
CSSFKEQLSP 
QMSSKGPVNV 
AIKTKKFSGG 



GHKIRSTSQG 
EPVWITHAKQ 



LLDAAGNLTK 
KQKSFKAHIS 
PPKPTKSVGF 
AEITQ 



HNPSHXQLED 


QEAFSFDLQK 


120 


GBVYAKTLPP 


RSLFQSSRKP 


180 


FSESKSFVED 


XjSSSEEEUJL 


240 


LRHPAQALGK 


PKNQQEVSSA 


300 


SIKQSDSVEP 


IPPRHPFQPW 


360 


VQQNMF5GSE 


DIAVERVISV 


420 


SQPSERPKFL 


DSMSTSAEWS 


480 


KEQLLPRHLS 


QLTVGNKVQQ 


540 


EKMAVEGTSN 


KSPIPRRPTQ 


600 


KHQVFSDSGS 


ANPKGGISSK 


660 


RQLSQALRKP 


EYEQKVSPVS 


720 


PVKQSSGEKR 


IiPSSSPPQQQ 


780 


SKNPIKSXPA 


PATKPGXFTI 


840 


LFGVRLKRAP 


PSQKYKSEKQ 


900 


I5YVADKQQS 


RPKSESMAKK 


960 


VKELKTKSNA 


GADAETKEPK 


1020 


EAQKILQVPA 


MEKETKRSST 


1080 



SEQ ID NO: 49 PAB7 DNA SEQUENCE 

Nucleic Acid Accession #: D87742

Coding sequence: 208-3582 (underlined sequences correspond to start and stop codons)



1 
I 

GCTTTCCTTT 
AACGCTATAA 
GATGTTAATC 
ATTGAAGAAA 
GCTGCCAAAG 
CCTCTGGCAG 
AAAATTCAGA 
AACGACAACC 
CTCTCAAAAG 
GCTGCTGCAG 
GGGCATAGTG 
TCTTTGCAGC 
ATGTCATCAA 
CTAGATAAGG 
GATACTCGTG 
GCTGCAGTGC 
GCAGAGGAGA 
ATGGAAGAGA 
AATGTGCAGG 
GCCTCAGAAG 
ACAGAAGACA 
GAGCCGGCAA 
TATTTAACIA 
TATGGACTGC 
ATTTTCTTAT 
CAAA3TTCTG 
TCAAATTATG 
AATATGATTC 
AATCAGGAAA 
GAACASAATG 
AAGGATGTEA 
GCTAAGCTTA 
AGGCTTAAGA 
GCTGAGCTCA 
CTTACTCACA 
TTAGAGTGTG 
GCAAATGGAG 
ATGGATGTCT 
CAGCTTAAGC 
TTGGAAGATG 
ACCTTGAGGC 



GCAGATGAAA 
GAAATGGAGG 
GAGAAGAAAG 
GAGAAAAGGG 
ATGCTGCAAG 
CCTCCACGGA 
GGAGAATGCT 
AATCGAAGAG 
CGATGGTCAG 
ACCATGATGA 
GTTAATATGG 
ATGGGAGGCC 
TTTGGGCCTC 



11 
I 

CTAAAGTAGA 
ATGCAAAACG 
TGCAAGTCCC 
GCAAGCAAGA 
■GGGTCAACAC 
ATAAGAAAGC 
CTCCAGAATT 
CTGAGGAACA 
AGGACCATGG 
AACCTGAAGA 
ACAAGAGGGA 
GGTTCCAGAA 
AACTGAAGTC 
TCTTCCGTGC 
TGGCTGAAAA 
TTGATGACAT 
CAGCCACACT 
TGCAACCACT 
TTCCTGAAGA 
TGTCACAGAA 
CTCCTATGGA 
GTGTCACACC 
AGTCGCTAGT 
CATGGAAACC 
GGAGAACTGT 
AGAAGTTGAA 
AACAGAAGAT 
TCTCTGATGA 
TTCTGGATGA 
TCAAGAATCA 
TTTCAATGAA 
GTGAAGAGAA 
AGAAAAAAGA 
GTGAGCAAAT 
AGGATGATAA 
AATCTGAATC 
■AAGTGGGAGG 
CTCGGACACA 
TAAGAGCCTC 
ACCGCAACTC 
AGAAAGTGGA 
TGAGTCAAGA 
AGGCAGTTTC 
ATGAATTACA 
CTCATGAAAA 
AAGCTGCCAA 
AAGAACCTGT 
GAGGTCCTCT 
CCCCTCCA3T 
ATATGCCTAG 
CTGAGGCATC 
ACAGCAGCTC 
CTCCAAAAGG 
CTGTACCACC 
GGCCACTTCC 



21 
I 

AGAGGATGAT 
GTCTAAAGAA 
TGACAGAGCA 
AACTAGTATG 
AGGAGGCAGG 
ACAGAGACCA 
AGGTGAAGTG 
TCTGAAGACC 
GAACACAGAG 
TGACTCGTTC 
GGACTTACTT 
GTACTTTAAT 
AGCGCAGCAG 
TTCTGAGTCA 
TAGAGATCTG 
TCAAGACCTC 
GGTGATGGCA 
GCATGAAGAT 
ACCCACCCAC 
GCCAAATACT 
TGCTATTGAT 
TTTGGAAAAC 
TGCTACATTG 
TGTA3TTATC 
CCTTGTTGTG 
GACTATCATG 
CAAGGAATCA 
AGCAATTAAA 
CACAGCTAAA 
GGACTTGATA 
TGCCTCAGAA 
GGTGAAGTCT 
GCAGTTGCAG 
CAAATCATTT 
TATTAATGCT 
TGAGGGTCAA 
TGACCGGAAT 
GACTGCAATA 
CGTGTCCACT 
ACTACAAGCT 
GATTCTGAAT 
AGAGTATGAA 
GGCTGCAGAG 
GAAGACAGAG 
CTGGCTCAAA 
TTTGAGACAC 
GATTGTAAAA 
GAGCCAGAAT 
GACAGTGGAG 
AAGTGAATTT 
TGGGAAACCC 
AAGAGGCTCT 
GCCCCCTCCT 
ACCCATTCGA 
TCCACCCTTT 



31 

I 

TATCCCTCTG 
AAAAACCCTG 
GTTTTAGGGA 
ATTTTGGATA 
GAACCAAATA 
TTTGAACGAA 
TTTCAGAATA 
TCAGGGCTTG 
AAGTACATGG 
CACTGGACTC 
ATCATAAGCA 
GTCCATGAGC 
GAGAGCCTGC 
CAAATTCTGA 
GG AATGAACG 
ATCTATTTTG 
CCACCTCTAG 
AATTTCTCAC 
TTGGACCAAC 
GAGAAAGACC 
GCAAACAAGC 
GCAATCCTTC 
CCTGATGATG 
ACTGCCTTCT 
AAGGATAGAG 
AAAGAAAATA 
AAGAAACATG 
TATAAGGATA 
AATCTTCGTG 
TCAGAAAACA 
TTTTCAGAGG 
GAATGCCATC 
CAGGAAATCG 
GAGAAGTCTC 
TTGACTAACT 
AATAAAGGTG 
GAGAAGATGA 
TCGGTAGTTG 
AAATGTAACC 
GCCAAAGCTG 
GAGCTCTATC 
CGGCAAGAAA 
GAAGTAAAAA 
CGGTCATTTA 
GCTCGTGCTG 
AAATTATTAG 
CCAATGCCAG 
GGCTCTTTTG 
CCACCCGTGA 
GGATCAGTGG 
TCTCCTTCTG 
TCCCCTACCA 
TTCCCAGGAG 
TATGGAOCAC 
GGCCCTGGTA 



41 

I 

AAGAACTACT 
GGAATCAGGG 
CCATTCATCC 
GTGAAAAAAC 
CAATGGTGGA 
GTGACTTTTC 
AAGATTCTGA 
CAGGGGAGCC 
GCACAGAAAG 
CACATACAAG 
GCTTCTTTAA 
TGGAAGCCTT 
CCTATAATAT 
GCATAGCAGA 
AAAATAACAT 
TCAGGTACAA 
AGGAAGGCTT 
GAGAGAAGAC 
GTGTGATTGG 
TGGACCCAGG 
AACCAGAGAC 
TAATATATTC 
TTCAGCCTGG 
TGGGAATTGC 
TATATCAAGT 
CAGAACTTGT 
TTCAGGAAAC 
AAATCAAGAC 
TTATGCTAGA 
AGAAATCTAT 
TTCAGATTGC 
GGGTTCAAGA 
AAGACTGGAG 
AGAAAGATTT 
GCATTACACA 
GAAATGATTC 
AAAATCAAAT 
AAGAGGATCT 
TGGAAGACCA 
GACTGGAAGA 
AGCAGAAGGA 
GAGAGCACAG 
CTTACAAGCG 
AAAACCAGAT 
CAGAAAGAGC 
AATTAACACA 



51 
I 

AGAGGATGAA 
CAGGCAGTTT 
AGATCCAGAA 
AAGTGAGACT 
AAAAGAACGC 
TGACAGCATA 
TTATCTGAAG 
TGAGGGAGAA 
CCAGGGGTCT 
TGTAGAGCCA 



GCCCATCCCC 
GACCTCTCTC 
ACGGGCCTCT 
ATCCAGGATC 
GGGTACTCGA 
TCCCTCTCAT 
CACCTCAGCT 
TGCGTCCACC 



GCTACAAGAA 
GGAAAAAGTC 
AAAAATGCTT 
ATTTGAAGAG 
GCACTCCACA 
GGGTGGAGCA 
AGCAGAACTT 
GGACACTCAT 
GCCAGTTACA 
AGCCGCCGAA 
ATTCATGTTT 
GCCTGATTTT 
TTCGTTTGCC 
CACGGAACAG 
ACAAAAATTG 
CAGGAAACAA 
ACTTGAAAAA 
ATCTGAGAGA 
AGAGAAGTTA 
ACTTAATGAA 
AGAAAATGCT 
TAAATTACAT 
GGAAGTAGCT 
GTTGAATCTG 
AGATGAATTA 
TAAGCAGATG 
AAAGCTTTTA 
GGTAAAGAAA 
TGAATGCAAA 
GATGGCTTTG 
GCTGTCAGCT 
GAGAATTGAA 
CGCTACCCAT 
TATAGCTGAA 
AAAGATGGCA 
TACACAAAAC 
TGTGAGTGGT 
TGCTACTCTC 
ACCTCATCCT 
TGGTACAGCT 
TGAAGGCAAG 
GAGCACCCCC 
CTGCGGACCT 
ACTAGGCTTA 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 






AGAGAATTTG CACCAGGCGT TCCACCAGGA AGACGGGACC TGCCTCTCCA CCCTCGGGGA 3360

TTTTTACCTG GACACGCACC ATTTAGACCT TTAGGTTCAC TTGGCCCAAG AGAGTACTTT 3420 

ATTCCTGGTA CCCGATTACC ACCCCCAACC CATGGTCCCC AGGAATACCC ACCACCACCT 3480

GCTGTAAGAG ACTTACTGCC GTCAGGCTCT AGAGATGAGC CTCCACCTGC CTCTCAGAGC 3540 

ACTAGCCAGG ACTGTTCACA GGCTTTAAAA CAGAGCCCAT AAAACTATGA CCTCTGAGGT 3600

TTCATTGGAA AGAAAGTGTA CTGTGCATTA TCCATTACAG TAAAGGATTT CATTGGCTTC 3660

AAAATCCAAA AGTTTATTTT AAAAGGTTTG TTGTTAGAAC TAAGCTGCCT TGGCAGTGTG 3720

CATTTTTGAG CCAAACAATT CAAAAATGTC ATTTCTTCCC TAAATAAAAA TCACCTTTTA 3780 

AGCTAGAGCG TCCTTACAAC TTTGAAATGT GCAATAAAGA ATACCTGTGT TTTAGCTAAT 3840 

GTAGCATATG TAATTGCAAA ATGATTTAGA ATGTCATGAA AAATATGAAC ATTTCCTGTG 3900 

GAAATGCTTT AAGAACATGT ATTTCCATTA TOCTATTTTT AGTGTACACC AGCTGAATAC 3960 

GGAGCAATGG TGTTTATAAG CGTTTTTTTA AACTATCTGG TCACAAAGAC TGTTACGCTA 4020 

AAAATGTTTA CTAAAAGATC ACTAAACTAT CTCCCCTCTT GCTGAAGTTC TTTGTAGTAA 4080 

TAGCTCATAA AAATTTGTTT ATTAATATTT CCCAAGTGTC TGTTGACTCA TTGGACTGTT 4140 

ATGAGGCTTG TGCCATTTGG GGAACATGTA AACTCACGCT CCCAGAACTG AAGATGGTGG 4200 

CTGGTGGCAC ACTTCCGGCT GCTCCTCCGT CACCTGTGAA CTCTACAAGT GATGTCTTTT 4260 

TATTTCAAAG AAGTTTATTT CCCACTTGTA TAGCATTCAC ATGCTTTCTT TACGATCCTC 4320 

ATTGTCTATT TGAGAATGGT TTTCTGAGAG TGAGTTTACA TTAGTAGCAA GAGTTGTTTG 4380 

ACCTGATGTT CCATTGTTTT TACCATTCCT GTAGAAAAAG GGTGCACAAC AGAAAAATGA 4440 

AAATGATGTG TCATGGCCAT AAAAGTATAG AAATCTTTAA AAATTTTAAA ATGTACAGTC 4500 

CCTTATCTAT CTTTCCCATT CCTTGCCACT GATTTTTGAG GAATATAATA AAAAGATTGG 4560 

AAGAGTATAA TGCCATGAGA AAGAATGATT TAGGACTGTG AGGGTTATAA CATGCCCTAG 4620 

GTCAGCAACC AAGGGTTGAA ATCAGTTCTG TTTTAGGGGG AAATGGGGGG GGCGACAGAT 4680

ATTATTCCAA AATTAATATT AATTAATATT TAAACGTTGG TGTTTTTATT TAAAAATCAG 4740 

TAACTAACCA TCTGGAATTG CACCATACTT AAAGTCTTAT CCATTACTAC ACTGTCTTTA 4800 

AAACAATGTT TCTTTAAATA CTCTACAACG TTTCTAAGAA CGAACTTCAG ACATTTTAAT 4860 

TACAGTAATA ATAGCACTCC TTTTAAGGAG TTTCAGATCC ACACTAAAAC TAAAATCATA 4920 

AAAGGCTGAT ACTTTTGTTT GCTGCTAGGC TATATTCTTC CATTCTTTGA AGTCCTATGA 4980 

TGTAATATTT TTGAAACCTA GTGTATGTCT TGTCACTGTT GTGATATTTA ATCGATTAAG 5040 

AATACCTTGT AAAAAGGAGC AAAAGCTTCA ATGTGAAACA ATTTTCTCTC TTTATACTAA 5100

ACAACTGAAG ATAGATAGTT TAGAAAGATA AGGACCTTTG AAAGAAGACA ACTCTGTCAA 5160

AGTTCATAAG GAATATAAAA ATTCTTCAGG AAAAGAGAAT TCAATCTATA TGTCCTCCCG 5220 

TTTAATATCA AGAATAGAAG AAATTAAGAG GAAAACTCCA CAGAAGAGCA TAGGCCACTT 5280 

TTAGCCATGT AAAAATAAGA TTAAGTCACA AATACAACTT TTGAATTTAC CTGTCAATAT 5340

CTCTTTAGGA CACAAAACAA TGCTGAAGTT AATATAATTT CTAATTTTAA ATOTCATTTA 5400

AGTGTAGATT ATGCCATCTA GGAAGGTAAG TAGGAAAGGT AAATTAAATC TATTTTTAAA 5460 

ATTCAAAATA TTAGAGTATT TTTCOCCTCT AAAGCCTTTT TTGGTGATTA TTCTGTATCT 5520

GACATAATTG AGAAACTGGT AAGCTGTAAA GATTCCAGTG TAGCTTCTCT GAGAAGTTGT 5580 

GAGCCAGTCC ATAACTGCTT CCTCACATCC ATCTGATTGC ACCATTTCTG CAGCAAACCC 5640 

CAAAGCAGGG TGCCAATATG CAGATGGCAT AGGGAGTATC ATCCCTCAGC CAAATCACTT 5700

TTCCATCTCT AAAGTTTCAT CTATTTTGGA AGTCATCTCC AACTAATTGT GTCTGGATTT 5760 

AGTTGCTAAA ATTGTCTTAT TTATTTATGA AGCAGCAATA TTCAGCCTGA AAGCATTTCT 5820 

GCCATAGTTG TTGTAGTTAT ATCGCCAATG GCTGATTTTT TTCATTGGAA AGTAAATTTA 5880 

AGTAATTCGT GGGATGTGGT ATATTCTGTG TCAACTTCAA GATAATCACT CATTTTCTCG 5940 
TTATATTCAG GTCTGAATTA AAGTTAAGTT AATCAC 

SEQ ID NO:50 PAB7 PROTEIN SEQUENCE

Protein Accession #: BAA13448



1 11 21 31 41 51 

I I I I I I 

AFLSKVEEDD YPSEELLEDE KAIKAKRSKE KNPGNQGRQF DVNLCVPDRA VLGTIHPDPE 60 

IEESKQETSM ILDSEKTSET AAKGVNTGGR EPNTOVEKER PLADKKAQRP FERSDFSDSI 120

KIQTPELGEV FQNKDSDYLK HDNFEEHLKT SGLAGBPEGE LSKEDHGNTE KYKGTESQGS 180

AAAEPEDDSF HWTPHTSVEP GHSDKREDLL IISSFFKEQQ SLQRFQKYFN VHELEALLQE 240 

MSSKLKSAQQ ESLPYNMEKV LDKVFRASES QILSIAEKML DTRVAENRDL GMNENNIPEE 300 

AAVLDDIQDL IYFVKYKHST AEETATLVMA PPLEEGLGGA MEEMQPLHED NFSREKTAEL 360 

NVQVPEEPTH LDQRVIGDTH ASEVSQKPNT EKDLDPGPVT TEDTPMDAID ANKQFETAAE 420

EPASVTPLEN AILLIYSFMF YLTKSLVATL PDDVQPGPDF YGLPWKPVFI TAFLGIASFA 480

IFLWRTVLW KDRVYQVTEQ QISEKLKTIH KENTELVQKL SNYEQKIKBS KKKVQETRKQ 540 

NMILSDEAIK YKDKIKTLEK NQEILDDTAK NLRVMLESER EQNVKNQDLI SENKKSIEKL 600 

KDVISMNASE FSEVQIALNE AKLSEEKVKS ECHKVQEENA RLKKKKEQLQ QEIEDWSKLH 660 

AELSEQIKSF EKSQKDLEVA LTHKDDNINA LTNCITQLNL LECESESEGQ NKGGNDSDEL 720

ANGEVGGDRN EKMKNQIKQM MDVSRTQTAI SWEEDLKLL QLKLRASVST KCNLEDQVKK 780

LEDDRNSLQA AKAGLEDECK TLRQXVEILN KLYQQKEMAL QKKLSQEEYE RQEREHRLSA 840 

ADEKAVSAAE EVKTYKRRIE EMEDELQKTE RSFKNQIATH EKKAHENWLK ARAAERAIAE 900 

EKREAANLRH KLLELTQKMA MLQEEFVTVK PMPGKPNTQN PPRRGPLSQN GSFGPSPVSG 960 

GECSPPLTVE PPVRPLSATL NRRDHPRSEF GSVDGPLPHP RWSAEASGKP SPSDPGSGTA 1020 

TMMNSSSRGS SPTRVLDEGK VNMAPKGPPP FPGVPLMSTP MGGPVPPPIR YGPPPQLCGP 1080

FGPRPLPPPF GPGMRPPLGL REFAPGVPPG RRDLPLHPRG FLPGHAPFRP LGSLGPREYF 1140 
IPGTRLPPPT HGPQEYPPPP AVRDLLPSGS RDEPPPASQS TSQDCSQALK QSP 

SEQ ID NO:51 PAB9 DNA SEQUENCE

Nucleic Acid Accession #: NM_006457

Coding sequence: 84-1874 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

AGACTGAGGC GGAGGCAGCC CCGCGCCGCG CCGGACOCGA GCATATTTCA TTTTCTGTCA 60

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TTGGACTTTG 
CTTGGGGTTT 
TAAAAGATGG 
TTGATGGAAT 
GTACAGGCTC 
TTCCTGTTCA 
CTGTGTCCAA 
GTTCTGTGTC 
CCCATGCGAC 
TGTTCGCTGC 
CACTGAGCGC 
GTTCCGAGAC 
AACAGCAAAA 
TACCCACTCA 
CAAGAACTGG 
AACATTTGAA 
CTCCGCAGTT 
CAACCTCTGG 
GATCCACTGG 
CTGGAAGAAT 
TGGGACAAAC 
CAGGGAAACG 
TGGCACTGGG 
TGGCCTACAT 
AATTCTTTGC 
CGTTGAAACA 
GGAACAATGT 
TCTTTGGTAC 
AAGCTCTGGG 
TGGAAGGTCA 
CTGTGAATTT 
AAATTAAAAT 
AGTGGCCCTG 
CATAAAGTAA 
AAATAAGCTT 
AGTGAAGAAT 
TGTTAGGTAG 
AACAGAATTA 
TTAAACAGAG 
GC6C66T6GC 
GAGGTCAGGA 
ACAAAAATTA 
GCACGAGAAT 
CACTCCACCC 
TATTTTTGCC 
TGTGTCATGC 



TTCATTGTGT 
GTAAAGATTT 
AATAGAGGGC 
GGGCGGATCA 
CTACTAAAAA 
GGGAGGCTGA 
CACACCACTG 



AGCCATTAGA 
CCGGCTGCAG 
CGGCAAGGCA 
AAATGCACAA 
TTTGAATATG 
AAAGGGAGAA 
AGTCACTTCC 
TTCACCAAAA 
CACCTCATCA 
ATCTGGACTG 
TGGTAAAACT 
TTCTCAGGAG 
TGGCCCACCA 
CAGTGATGCC 
AACAACTCAG 
AGAATCTGAA 
GGCTTOCTTS 
CAGACCAGGG 
CGTCATCAAG 
CTCAAACAGC 
CCAGCCAAGT 
AACTCCGATG 
GAAATCTTGG 
TGGATTTGTA 
CCCTGAATGT 
AACTTGGCAT 
TTTTCACTTG 
TATATGCCAT 
CTACACCTGG 
GACCTTTTTC 
TTGAAAGTCA 
TACTAATTAA 
AAGGAATAAA 
AGAGACGGTT 
TATAAAAACC 
TTAATTTTAG 
TTATGAGTAA 
TTGTATTTAA 
AATTTTATCA 
TCACGCCTGT 
GTTTGAGATC 
GCCGGACGCA 
CACTTGAACC 
TGGGTGACAG 
TTACAGTGGA 
CAGTAAGAGA 
AGC TTAT AAG 
ATGCTTCATC 
AATTAAATAA 
CAGGTGTGGT 
TGAGGTCAAG 
TACAAAAATG 
GGCAGGAAAA 
CACTCCAGCC 



ACCATGAGCA 
GGCGGTAAGG 
GCCCAGGCAA 
GGAATGACTC 
ACTCTGCAAA 
CCTAAAGAAG 
ACAAACAACA 
GTCACATCCA 
CATGCTTCCC 
CATGCTAATG 
GCAGTTAATG 
CTAGCAGAGG 
AGAAAACACA 
AGCAAGAAGA 
TCTCGCTCTT 
GCCGATAATA 
GTAGCTTCCA 
GTTACCAGCC 
TCACCAAGCT 
GCTACTTACT 
GACCAGGACA 
TGCGCCCATT 
CACCCAGAAG 



GGTCGATGCC 

GAGGATGGTG 
GGATGTGAAT 
CATGACACTT 
TCCAAGAAGG 
ACAGTTCAGG 
TTTTTAGATT 
TTCCAGCTTT 
TGGCATTTAT 
AATTTCCTGA 
AATAAATAAT 
ATCTGCAAAA 
AAAAAAACTA 
GTAATAGGTG 
AATCCCAGCA 
AGCCTGGCCA 
GTGGCACGCG 
CGGGAGGGAG 
AGTGAGACTC 
TCATTCTAGT 
TGTTATATTC 
TCTCAAATTT 
ACCTATATTA 
TTTTGGCCTC 
GGCTCACGCC 
AGATCAAGAT 
AGCTGGGCAT 
TTCTTGAACC 
TGGTGACAGA 



ACTACAGTGT 
ATTTCAACAT 
ATGTAAGAAT 
ATCTTGAAGC 
GAGCATCTGC 
TAGTTAAACC 
TGGCCTACAA 
TCCCATCACC 
CTTCACCCGT 
CCAATCTTAG 
TCCCACGGCA 
GACAGAGAAG 
TTGTGGAGCG 
GACTGATTGA 
TCCGAATCCT 
CAAAGAAGGC 
CACGGAGCAT 
TCACAACTGC 
GGCAACGGCC 
CAGGATCAGT 
CTTTAGTGCA 
GTAACCAGGT 
AATTCAACTG 
GAGCCCTGTA 
AAAGGAAGAT 
TTGTGTGTGT 
AACCCTACTG 
TTCCCATAGA 
GCTTTGTATG 
ACAAGCCCCT 
AGAAGAGAAG 
CAATATTTAT 

AAAAACCAAG 
TATTACTTTT 
TGGACTATTA 
CCAATCTGAA 
GGCAATGAAA 
ATACTTATCT 
TCAGTTTTTA 
CTTTGGGAGG 
ACATGGTGAA 
CCTGTAATCC 
AGGTTGCAGT 
CGTCTCCAAA 
AGGAAAGGAC 
TTTTCTTATT 
TTGCCTTTTA 
GGCAAATTCC 
TCATAGTTTT 
TGTGATCOCA 
CATCCTGGCC 
GGTGGGGCGT 
CAGGAGACGG 
GCAAGACTCC 



SEQ ID NO:52 PAB9 Protein sequence 
Protein Accession #: NP_006448 




1 


11 


l 


1 

MSNYSVSLVG 


1 

PAPWGFRLQG 


61 


HTHLEAQNKI 


KGCTGSLNHT 


121 


NNMAYNKAPR 


PFGSVSSPKV 


181 


ANANLSADQS 


PSALSAGKTA 


241 


KHIVEHYTEF 


YHVPTHSDAS 


301 


DNTKKANNSQ 


EPSPQLASLV 


361 


PSWQRPNQGV 


PSTGRISKSA 


421 


AHCNQVIRGP 


FLVALGKSWH 


481 


RCQRKILGEV 


INALKQTWHV 


541 


CEPPIEAGDM 


FLEALGYTWH 



21 
I 

GKDPNMPLTI 
LQRASAAPKP 
TSIPSPSSAF 
VNVPRQPTVT 
KKRLIEDTED 
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GTCACTGGTT 
GCCTCTGACA 
AGGCGATGTG 
CCAGAATAAG 
TGCACCCAAG 
TGTGCCCATT 
TAAGGCACCA 
ATCGTCTGCC 
GGCTGCCGTC 
TGCTGACCAG 
GCCCACAGTC 
AGGATCCCAG 
CTATACAGAG 
GGATACTGAA 
TGCCCAGATC 
AAATAACTCT 
GCCCGAGAGC 
AGCTGCCTTC 
AAACCAAGGA 
GGCACCAGCC 
AAGAGCTGAG 
CATCAGAGGA 
CGCTCACTGC 
TTGTGAGCTG 
CCTTGGAGAA 
AGCCTGTGGA 
TGAGACTGAT 
AGCTGGTGAC 
CTCAGTGTGT 
GTGTAAGAAA 
GAATTTGAAG 
ATGGAGTTTT 
TCTGAGGAAA 
TCCTGTATTT 
AATTCATCTT 
ATAATTATAC 
ATGCCTTAAA 
TTAAAATAGT 
AAAAATTGCT 
CCAAGGTGGG 
ACCCCATCTC 
CAGCTACTCA 
GAGCCAAGAT 
AAAAAACTTT 
AATAAGATTT 
TCTTCCCCAC 
CTAAAATGTG 
ATTTTTTCCC 
CTCTCTCTTT 
GCACTTTGGG 
AACATGGTGA 
GCCTGTAGTC 
AAGTTGCAGT 
GGCTCTT 



41 



GGCCCAGCTC 
ATCTCTAGTC 
GTTCTCAGCA 
ATTAAGGGTT 
CCTGAGCCGG 
ACATCTCCTG 
CGGCCTTTTG 
TTCACCCCAG 
ACTCCTCCCC 
TCTCCATCTG 
ACCAGCGTGT 
GGTGACAGTA 
TTTTATCATG 
GACTGGCGTC 
ACTGGGACTG 
CAGGAGCCTT 
CTGGACAGCC 
AAGCCTGTAG 
GTAGCTTCCA 
AACTCAGCTT 
CACATTCCAG 
CCATTCTTAG 
AAAAATACAA 
TGCTATGAGA 
GTCATCAATG 
AAGCCCATTC 
TATTATGCCC 
ATGTTCCTGG 
TGTGAAAGTT 
CATGCTCATT 
AGAAAAAGGA 
GAAAAATAAT 
TATTTGGCTT 
TATGCCCATA 
AGAATAAATT 
CTTCTTTCCT 
TTTTATCAAT 
AAATAGGATT 
TGTAGGCTGA 
TGGACCACAT 
TACTAAAAAT 
AGAGGCTGAG 
CGTACCACTG 
GCTTGTATAT 
TTTATCAAAA 
CCAAAAATAA 
ATTGTTTCTG 
TTGCGCTAAG 
AAAGAGAATA 
AGGCCAAGAC 
AACCCTGTCT 
CCATGTACTT 
GAGCTGAGAT 



51 
I 



120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 



SSLKDGGKAA QANVRIGDW LSIDGINAQG 
EPVPVQKGEP KEWKPVPIT SPAVSKVTST 
TPAHATTSSH ASPSPVAAVT PPLFAASGLH 



TYSGSVAPAN 



SCFVCVACGK 
DTCFVCSVCC 



WRPRTGTTQS RSFRILAQIT GTEHLKESEA 
DSPTSGRPGV TSLTTAAAFK PVGSTGVTKS 
SALGQTQPSD QDTLVQRAEH IPAGKRTPHC 
NTMAYIGFVE EKGALYCELC YEKFFAPECG 
PIRNNVFHLE DGEPYCBTDY YALFGTICHG 
ESLEGQTFFS KKDKPLCKKH AHSVNF 



60 
120 
180 
240 
300 
360 
420 
480 
540 



SEQ ID NO:53 PBH7 DNA SEQUENCE 

Nucleic Acid Accession #: AA431407 

Coding sequence: 1-864 (underlined sequences correspond to start and stop codons) 
1 11 21 31 41 51 

I I I I I I 

ATGGCCAACT GTAAAATGAC CAAAAGCATC AGGTTCCCTG CCCTGGAGCA CTGCTATACT 60 

GGCGGGGAGG TCGTGTTGCC CAAGGATCAG GAGGAGTGGA AAAGACGGAC GGGCCTTCTG 120 

CTCTACGAGA ACTATGGGCA GTCGGAAACG GGACTAATTT GTGCCACCTA CTGGGGAATG 180 

AAGATCAAGC CGGGTTTCAT GGGGAAGGCC ACTCCACCCT ATGACGTCCA GTTTCATATG 240 

GAGGCCTCAG TTGAAAACTG CATTATTGTG AGCATGAACA CCGCTGACCC TGGCAGCCAG 300 

GGCATCACAC ACAGCCTCTT GCTACAGGTC ATTGATGACA AGGGCAGCAT CCTGCCACCT 360 

AACACAGAAG GAAACATTGG CATCAGAATC AAACCTGTCA GGCCTGTGAG CCTCTTCATG 420 

TGCTATGAGG GTGACCCAGA GAAGACAGCT AAAGTGGAAT GTGGGGACTT CTACAACACT 480 

GGGGACAGAG GAAAGATGGA TGAAGAGGGC TACATTTGTT TCCTGGGGAG GAGTGATGAC 540 

ATCATTAATG CCTCTGGGTA TCGCATCGGG CCTGCAGAGG TTGAAAGCGC TTTGGTGGAG 600 

CACCCAGCGG TGGCGGAGTC AGCCGTGGTG GGCAGCCCAG ACCCGATTCG AGGGGAGGTG 660 

GTGAAGGCCT TTATTGTCCT GACCCCACAG TTCCTGTCCC ATGACAAGGA TCAGCTGACC 720 

AAGGAACTGC AGCAGCATGT CAAGTCAGTG ACAGCCCCAT ACAAGTACCC AAGGAAGGTG 780 

GAGTTTGTCT CAGAGCTGCC AAAAACCATC ACTGGCAAGA TTGAACGGAA GGAACTTCGG 840 

AAAAAGGAGA CTGGTCAGAT GTAATCGGCA GTGAACTCAG AACGCACTGC ACACCTGAGG 900 

CAAAOCCCTG GCCACTTTAG TCTCCCCACT ATGGTGAGGA CGAGGGTG6G GCATTGAGAG 960 

TGTTGATTTG GGAAAGTATC AGGAGTGCCA TGATTCCAAT GTTTTCCTTC TTTTAAATTA 1020 

AATTCAGTTG CTCTGCTTCC TCCAAGTCCT CTGTATCTTT AGAATTTCCC AGGTGAGCAC 1080 
TCATAACGCA AGTAATAAAA TACTGATATC AACAA 

SEQ ID NO:54 PBH7 Protein sequence 

Protein Accession #: FGENESH predicted 

1 11 21 31 41 51 

I I I I I I 

MANCKMTKSI RFPALEHCYT GGEVVLPKDQ EEWKRRTGLL LYENYGQSET GLICATYWGM 60 

KIKPGFMGKA TPPYDVQFHM EASVENCIIV SMNTADPGSQ GITHSLLLQV IDDKGSILPP 120 

NTEGNIGIRI KPVRPVSLFM CYEGDPEKTA KVECGDFYNT GDRGKMDEEG YICFLGRSDD 180 

IINASGYRIG PAEVESALVE HPAVAESAVV GSPDPIRGEV VKAFIVLTPQ FLSHDKDQLT 240 
KELQQHVKSV TAPYKYPRKV EFVSELPKTI TGKIERKELR KKETGQM 

SEQ ID NO:55 PBJ5 DNA SEQUENCE 

Nucleic Acid Accession #: AF388200 

Coding sequence: 33-137 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

GAGAGAGGGA GGCAGAAGAG GAAGTCAGAG CQATGTGCTG TGAAATCTAC TACCGTTTGC 60 

TGGTTTTGAA AATGGAGAAA AAGAGTGAGG AACTGAGAAA CATGGATGGC CTTGGGAACG 120 

TGGAAAAGGG TCACTGAAAT GGGACGACAT GAACTCAAGG AGGCTATTTA TGACCATGTC 180 

ATTTGCAACA TGAAGAAAGC TTATCTGGAG TGAAAGTAAA TGAGACCAAC AGAGATAAGA 240 

GACCCGGAGA AATCCTGGTT ACACTGCTTG AATCCTGTCA GTCCTATACT GGAGTCCTGT 300 

TAATACAAAA TAATAGTAAT AATCCCTCTG TTTCTTATGT TTATGCCAAC TTCAACAAAA 360 

AGAAACTTGA CTAAGAGACA ATATAAGAAC TTAATGTGTA ATTAAGAAAG AACTCTCCAC 420 

CACGGGGAAT GTGAAAGGTA TATGAGTCCC TTTTCACGAT GCGATGTCAT GTCTTTTAAA 480 
TAAGCCATAC TTTATGTTCA ATAAAAAOAG AATAAGCAGG A 
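
Each sequence row in this listing ends with a running position counter (60, 120, 180, ...). The following is a minimal sketch, not part of the original listing, of how those counters could be checked when the rows are read back in. The function name `join_rows` and the sample rows chosen are illustrative assumptions; the rows are the first three rows of SEQ ID NO:55, copied as printed.

```python
def join_rows(rows):
    """Concatenate sequence rows and check each trailing position counter.

    Returns the joined sequence and a list of (printed, actual) pairs for
    every row whose printed counter disagrees with the running length.
    Only sequence rows should be passed in (no ruler or tick-mark lines).
    """
    pieces = []
    total = 0
    mismatches = []
    for row in rows:
        tokens = row.split()
        printed = None
        # A trailing all-digit token is treated as the position counter.
        if tokens and tokens[-1].isdigit():
            printed = int(tokens.pop())
        residues = "".join(tokens)  # stray spaces inside a block are ignored
        pieces.append(residues)
        total += len(residues)
        if printed is not None and printed != total:
            mismatches.append((printed, total))
    return "".join(pieces), mismatches


# First three rows of SEQ ID NO:55, copied as printed in the listing.
ROWS = [
    "GAGAGAGGGA GGCAGAAGAG GAAGTCAGAG CQATGTGCTG TGAAATCTAC TACCGTTTGC 60",
    "TGGTTTTGAA AATGGAGAAA AAGAGTGAGG AACTGAGAAA CATGGATGGC CTTGGGAACG 120",
    "TGGAAAAGGG TCACTGAAAT GGGACGAC AT GAA CTCAAGG AGGCTATTTA TGACCATGTC 180",
]

sequence, mismatches = join_rows(ROWS)
print(len(sequence), mismatches)
```

A row whose printed counter does not equal the running length is reported rather than corrected, since either the counter or the row itself may be at fault.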

SEQ ID NO:56 PBJ5 Protein sequence 
Protein Accession #: AAK83352 

1 11 21 31 41 51 

I I I I I I 

MCCEIYYRLL VLKMEKKSEE LRNMDGLGNV EKGH 
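
The "Coding sequence" coordinates relate each DNA entry to its protein entry. As a hedged illustration only, the sketch below translates positions 33-137 of SEQ ID NO:55 with the standard genetic code. The helper name `translate_cds` is an assumption made for this example, and the rows are copied as printed, including the unreadable character at position 32, which lies outside the coding region.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate_cds(dna, start, end):
    """Translate the 1-based, inclusive region start..end of dna.

    Translation stops at the first stop codon; codons containing a
    character outside A/C/G/T are reported as 'X'.
    """
    cds = dna[start - 1:end]
    if len(cds) % 3 != 0:
        raise ValueError("coding region length is not a multiple of three")
    residues = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")
        if aa == "*":
            break
        residues.append(aa)
    return "".join(residues)


# First three rows of SEQ ID NO:55 as printed (position counters removed).
ROWS = [
    "GAGAGAGGGA GGCAGAAGAG GAAGTCAGAG CQATGTGCTG TGAAATCTAC TACCGTTTGC",
    "TGGTTTTGAA AATGGAGAAA AAGAGTGAGG AACTGAGAAA CATGGATGGC CTTGGGAACG",
    "TGGAAAAGGG TCACTGAAAT GGGACGAC AT GAA CTCAAGG AGGCTATTTA TGACCATGTC",
]
dna = "".join("".join(row.split()) for row in ROWS)

protein = translate_cds(dna, 33, 137)
print(protein)
```

Any position where the translation and the printed protein row disagree is a candidate transcription error in one of the two rows and is worth checking against the accession record.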

SEQ ID NO:57 PBJ7 DNA SEQUENCE 

Nucleic Acid Accession #: AA876910 

Coding sequence: 1-2064 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

ATGGACAGTT GCCTGCAACA TATGAGAGAC CTACTTTACC TCCTTCAGGA GCTCAGGTGT 60 

TTAAATCCAG CTACACTACT CCCTGATCCA GACTCCACTA CTCCTGTTCA TGACTGTCAG 120 

GATCTGTTGG AAACTACCAA AACTGGCCAA CCTGATCTTC AAGATGTGCC CCTAGAAAAG 180 

GCAGATGCCA CTGTGTTCAC AGATGGTAGC AGCTTCCTCG AGCAGGGAGA ACGAAAAGCT 240 

GWTOTm: CACAGCCAGA TCTGCCTGAC AATCCCACAT ACTCAACAGA AGAAGAAAAA 300 

CTGGCTTCAG ATGTTGGAGC AAATAAAAAT CAGGAAGGAC GTGTATTCGC AAACACTACT 360 

TGGAGGGCCG GTACCTCCAA GGAAGTCTCC TTTGCAGTTG ATTTATGTGT ACTGTTCCCA 420 

GAGCCAGCTC GTACCCATGA AGAGCAACAT AATTTGCCGG TCATAGGAGC AGGAAGTGTC 480 

GACCTTGCAG CAGGATTTGG ACACTCTGGG AGCCAAACTG GATGTGGAAG CTCCAAAGGT 540 

GCAGAAAAAG GGCTCCAAAA TGTTGACTTT TACCTCTGTC CTGGAAATCA CCCTGACGCT 600 

AGCTGTAGAG ATACTTACCA GTTTTTCTGC CCTGATTGGA CATGTGTAAC TTTAGCCACC 660 

TACTCTGGGG GATCAACTAG ATCTTCAACT CTTTCCATAA GTCGTGTTCC TCATCCTAAA 720 

TTATGTACTA GAAAAAATTG TAATCCTCTT ACTATAACTG TCCATGACCC TAATGCAGCT 780 

CAATGGTATT ATGGCATGTC ATGGGGATTA AGACTTTATA TCCCAGGATT TGATGTTGGG 840 

ACTATGTTCA CCATCCAAAA GAAAATCTTG GTCTCATGGA GCTCCCCCAA GCCAATCGGQ 900 

CCTTTAACTG ATCTAGGTGA CCCTATATTC CAGAAACACC CTGACAAAGT TGATTTAACT 960 

GTTCCTCTGC CATTCTTAGT TCCTAGACCC CAGCTACAAC AACAACATCT TCAACCCAGC 1020 

CTAATGTCTA TACTAGGTGG AGTACACCAT CTCCTTAACC TCACCCAGCC TAAACTAGCC 1080 

CAAGATTGTT GGCTATGTTT AAAAGCAAAA CCCCCTTATT ATGTAGGATT AGGAGTAGAA 1140 

GCCACACTTA AACGTGGCCC TCTATCTTGT CATACACGAC CCCGTGCTCT CACAATAGGA 1200 

GATGTGTCTG GAAATGCTTC CTGTCTGATT AGTACCGGGT ATAACTTATC TGCTTCTCCT 1260 

TTTCAGGCTA CTTGTAATCA GTCCCTGCTT ACTTCCATAA GCACCTCAGT CTCTTACCAA 1320 

GCACCCAACA ATACCTGGTT GGCCTGCACC TCAGGTCTCA CTCGCTGCAT TAATGGAACT 1380 
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GAACCACGAC CTCTCCTGTG CGTGTTAGTT CATGTACTTC CCCAGGTATA TGTGTACAGT 1440 

GGACCAGAAG GACGACAACT CATCGCTCCC CCTGAGTTAC ATCCCAGGTT GCACCAAGCT 1500 

GTCCCACTTC TGGTTCCCCT ATTGGCTGGT CTTAGCATAG CTGGATCAGC AGCCATTGGT 1560 

ACGGCTGCCC TGGTTCAAGG AGAAACTGGA CTAATATCCC TGTCTCAACA GGTGGATGCT 1620 

GATTTTAGTA ACCTCCAGTC TGCCATAGAT ATACTACATT CCCAGCTAGA GTCTCTGGCT 1680 

GAAGTAGTTC TTCAAAACTG CCGATGCTTA GATCTGCTAT TCCTCTCTCA AGGAGGTTTA 1740 

TGTGCAGCTC TAGGAGAAAG TTGTTGCTTC TATGCCAATC AATCTGGAGT CATAAAAGGT 1800 

ACAGTAAAAA AAGTTCGAGA AAATCTAGAT AGGCACCAAC AAGAACGAGA AAATAACATC 1860 

CCCTGGTATC AAAGCATGTT TAACTGGAAC CCATGGCTAA CTACTTTAAT CACTGGGTTA 1920 

GCTGGACCTC TCCTCATCCT ACTATTAAGT TTAATTTTTG GGCCTTGTAT ATTAAATTCG 1980 

TTTCTTAATT TTATAAAACA ACGCATAGCT TCTGTCAAAC TTACGTATCT TAAGACTCAA 2040 
TATGACACCC TTGTTAATAA CTGA 

SEQ ID NO:58 PBJ7 Protein sequence 

Protein Accession #: FGENESH predicted 

1 11 21 31 41 51 

I I I I I I 

MDSCLQHMRD LLYLLQELRC LNPATLLPDP DSTTPVHDCQ DLLETTKTGQ PDLQDVPLEK 60 

ADATVFTDGS SFLEQGERKA VSFPQPDLPD NPTYSTEEEK LASDVGANKN QEGRVFANTT 120 

WRAGTSKEVS FAVDLCVLFP EPARTHEEQH NLPVIGAGSV DLAAGFGHSG SQTGCGSSKG 180 

AEKGLQNVDF YLCPGNHPDA SCRDTYQFFC PDWTCVTLAT YSGGSTRSST LSISRVPHPK 240 

LCTRKNCNPL TITVHDPNAA QWYYGMSWGL RLYIPGFDVG TMFTIQKKIL VSWSSPKPIG 300 

PLTDLGDPIF QKHPDKVDLT VPLPFLVPRP QLQQQHLQPS LMSILGGVHH LLNLTQPKLA 360 

QDCWLCLKAK PPYYVGLGVE ATLKRGPLSC HTRPRALTIG DVSGNASCLI STGYNLSASP 420 

FQATCNQSLL TSISTSVSYQ APNNTWLACT SGLTRCINGT EPGPLLCVLV HVLPQVYVYS 480 

GPEGRQLIAP PELHPRLHQA VPLLVPLLAG LSIAGSAAIG TAALVQGETG LISLSQQVDA 540 

DFSNLQSAID ILHSQVESLA EVVLQNCRCL DLLFLSQGGL CAALGESCCF YANQSGVIKG 600 

TVKKVRENLD RHQQERENNI PWYQSMFNWN PWLTTLITGL AGPLLILLLS LIFGPCILNS 660 
FLNFIKQRIA SVKLTYLKTQ YDTLVNN 

SEQ ID NO:59 PCQ1 DNA SEQUENCE 

Nucleic Acid Accession #: NM_019005 

Coding sequence: 182-1885 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

TGATGGTGGA AATTTCTTGA AACCGCTCTC GTAATTTGCC ACGTGCTGTT GCAAATATTC 60 

TGGTGAATGA ACACAGAATC AGCATGGCTT TCCTTTGCTG AGAAATCACT GATGGGAAGT 120 

GAGACTTGTT AAACTTGAAA GTGAATGGAC CTGAGTGGAC CCTTTGATCA CATCAGTAAA 180 

CATGAGCGGT ACCAAACCTG ATATTTTATG GGCACCACAC CATGTTGATA GATTTGTTGT 240 

GTGTGACTCA GAACTAAGTC TTTATCATGT GGAATCTACT GTGAATTCAG AACTCAAAGC 300 

TGGATCTTTA CGTTTATCTG AAGACTCTGC AGCTACATTA CTGTCAATAA ATTCAGATAC 360 

ACCCTATATG AAATGTGTTG CCTGGTATCT TAATTATGAT CCTGAATGTC TGCTGGCAGT 420 

TGGACAAGCA AATGGTCGAG TTGTACTTAC AAGCCTTGGT CAAGATCATA ACTCAAAGTT 480 

CAAAGATTTG ATAGGAAAAG AGTTTGTTCC AAAACATGCA CGACAATGTA ATACCCTTGC 540 

CTGGAATCCA CTGGATAGTA ACTGGCTAGC TGCTGGTTTA GATAAGCACA GAGCTGACTT 600 

TTCAGTGCTA ATATGGGATA TCTGCAGCAA ATATACTCCT GATATAGTTC CCATGGAAAA 660 

AGTGAAACTT TCAGCAGGTG AAACTGAAAC AACATTATTA GTAACAAAAC CACTTTATGA 720 

GTTAGGACAG AATGATGCTT GTCTGTCTCT TTGTTGGCTT CCACGAGACC AGAAACTTCT 780 

CCTTGCTGGT ATGCATCGTA ACCTAGCTAT ATTTGATCTT CGGAATACAA GCGAAAAGAT 840 

GTTCGTAAAT ACAAAAGCTG TTCAGGGTGT GACGGTAGAC CCATATTTCC ACGATCGTGT 900 

TGCTTCCTTC TATGAAGGTC AGGTTGCAAT ATGGGATCTT AGAAAATTTG AGAAGCCAGT 960 

TTTGACATTG ACTGAGCAAC CAAAACOCTT AACAAAAGTA GCATGGTGTC OCACTAGGAC 1020 

TGGTCTACTT GCCACTTTAA CAAGGGATAG TAATATTATT AGATTGTATG ATATGCAGCA 1080 

TACACCCACT COCATTGGGG ATGAAACTGA ACCCACAATA ATTGAAAGAA GTGTGCAACC 1140 

TTGTGACAAT TACATTGCTT CCTTTGCGTG GCATCCAACA AGTCAAAATC GAATGATAGT 1200 

TGTAACTCCC AACCGAACAA TGTCAGACTT CACTGTTTTT GAAAGGATAT CTCTTGCCTG 1260 

GAGCCCAATT ACATCTTTAA TGTGGGCTTG TGGTCGTCAT TTATATGAAT GTACGGAAGA 1320 

AGAAAATGAT AATTCTTTAG AAAAAGATAT AGCAACGAAG ATGCGTCTTC GGGCTTTATC 1380 

AAGGTATGGA CTTGATACAG AGCAGGTGTG GAGGAACCAC ATTTTAGCTG GAAATGAAGA 1440 

TCCACAGCTC AAGTCACTCT GGTATACTCT GCACTTTATG AAGCAATACA CAGAAGATAT 1500 

GGATCAGAAA TCTCCAGGCA ACAAAGGATC ATTGGTTTAT GCAGGAATTA AATCAATTGT 1560 

AAAGTCATCG TTGGGAATGG TGGAAAGCAG CAGACATAAT TGGAGTGGGT TGGATAAGCA 1620 

AAGTGATATT CAAAACTTAA ATGAAGAGAG AATCTTAGCT TTACAGCTTT GTGGGTGGAT 1680 

AAAGAAAGGA ACGGATGTAG ACGTGGGGCC ATTTTTGAAC TCCCTTGTAC AAGAAGGGGA 1740 

ATGGGAAAGA GCTGCTGCTG TGGCATTGTT CAACTTGGAT ATTCGCCGAG CAATCCAAAT 1800 

CCTGAATGAA GGGGCATCTT CTGAAAAAGG CAGGAGATCT GAATCTCAAT GTGGTAGCAA 1860 

TGGCTTTATC GGGTTATACG GATGAGAAGA ACTCCCTTTG GAGAGAAATG TGTAGCACAC 1920 

TGCGATTACA GCTAAATAAC CCGTATTTGT GTGTCATGTT TGCATTTCTG ACAAGTGAAA 1980 

CAGGATCTTA CGATGGAGTT TTGTATGAAA ACAAAGTTGC AGTACGTGAC AGAGTGGCAT 2040 

TTGCTTGTAA ATTCCTTAGT GATACTCAGA TACATCGAAA AGTTGACCAA TGAAATGAAA 2100 

GAGGCTGGAA ATTTGGAAGG AATTTTGCTT ACAGGCCTTA CTAAAGATGG AGTGGACTTA 2160 

ATGGAGAGTT ATGTTGATAG AACTGGAGAT GTTCAAACAG CAAGTTACTG TATGTTACAG 2220 

GGTTCACCTT TAGATGTTCT TAAAGATGAA AGGGTTCAGT ACTGGATTGA GAATTATAGA 2280 

AATTTATTAG ATGCCTGGAG GTTTTGGCAT AAACGAGCTG AATTTGATAT TCACAGGAGT 2340 

AAGTTGGATC CCAGTTCCAA GCCTTTAGCA CAAGTTTTTG TGAGTTGCAA TTTCTGTGGC 2400 

AAGTCAATCT CCTACAGCTG TTCAGCTGTG CCTCATCAGG GCAGAGGTTT TAGTCAGTAT 2460 

GGTGT6AGTG GCTCACCAAC GAAATCTAAA GTCACAAGTT GTCCTGGCTG TCGAAAACCA 2520 

CTTCCTCGAT GTGCGCTTTG TCTCATTAAT ATGGGAACAC CAGTTTCTAG CTGTCCTGGA 2580 






GGAACCAAAT CAGATGAAAA AGTGGACTTG AGCAAGGACA AAAAATTAGC CCAATTTAAC 2640 

AACTGGTTTA CATGGTGTCA TAATTGCAGG CAC66T60AC ATOCTGGACA TATGCTTA6T 2700 

TGGTTCAGGG ACCATGCAQA GTGCCCTGTG TCTGCATGCA CGTGTAAAT6 TATGCAGTTG 2760 

GATACAACGG GGAATCTGGT ACCTGCAGAQ ACTGTOCAGC CATAAAATGT TACCACCTTA 2820 

AGAGAACCCT TCAAGTGTGG AGCTTTCTAG TAGGTGTCCT TCATAGCTCA GAAACATACC 2880 

TCAGAACAAG CCATTCATGA CTTACCTGTA ATGGGAAAAT AAATCATTCT ATCAGAAAAA 2940 
AAAAAAAAAA AAAAAAAAAA 

SEQ ID NO:60 PCQ1 Protein sequence 
Protein Accession #: NP_061878 

1 11 21 31 41 51 

I I I I I I 

MSGTKPDILW APHHVDRFVV CDSELSLYHV ESTVNSELKA GSLRLSEDSA ATLLSINSDT 60 

PYMKCVAWYL NYDPECLLAV GQANGRVVLT SLGQDHNSKF KDLIGKEFVP KHARQCNTLA 120 

WNPLDSNWLA AGLDKHRADF SVLIWDICSK YTPDIVPMEK VKLSAGETET TLLVTKPLYE 180 

LGQNDACLSL CWLPRDQKLL LAGMHRNLAI FDLRNTSQKM FVNTKAVQGV TVDPYFHDRV 240 

ASFYEGQVAI WDLRKFEKPV LTLTEQPKPL TKVAWCPTRT GLLATLTRDS NIIRLYDMQH 300 

TPTPIGDETE PTIIERSVQP CDNYIASFAW HPTSQNRMIV VTPNRTMSDF TVFERISLAW 360 

SPITSLMWAC GRHLYECTEE ENDNSLEKDI ATKMRLRALS RYGLDTEQVW RNHILAGNED 420 

PQLKSLWYTL HFMKQYTEDM DQKSPGNKGS LVYAGIKSIV KSSLGMVESS RHNWSGLDKQ 480 

SDIQNLNEER ILALQLCGWI KKGTDVDVGP FLNSLVQEGE WERAAAVALF NLDIRRAIQI 540 
LNEGASSEKG RRSESQCGSN GFIGLYG 

SEQ ID NO:61 PDG3 DNA SEQUENCE 

Nucleic Acid Accession #: U42359 

Coding sequence: 563-775 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

| | | | | | 

TTGTACATCT TAACAACCTT AAGCTGTACA AATAGANCAA TAATATCTAA ATGGTGTGAT 60 

GATCAGCCCA CAGTACACAT CATTGATGAG AATTTCACTG GTCTCAACCT TTCTCATGCT 120 

GAGTCCTGGC TTTGTAAAAT GACTTATAAA GGTOCAAGGA TTTAGAGATG ATTAAGAGAT 180 

AAGCTGGCAT TCTGTAAAGG CACCATCGTC TATCOCCTGT CTTATCTAGA TAAAGAATGT 240 

AGTGCTAAAT CTTGTAATAA TATTGTACAA ATGGAAATTC AATCTTAAGG ATTATTTTTT 300 

CCATATTGTT GTATTTCATT GTGGTGTATT GGAAAGTGAT CTGGACTTTG AGTGAGAAGA 360 

TGTGATTTGG ACCATGGCAC TTAAAAACTC TATAACCTCA GGCAAGTCTT TTAATCTTCT 420 

CTGAGCCTCA GTTTTCCTCA TTTTTCAAAT ATAGAGAGTA TAACATTTAT CTCATAAGAC 480 

AAGTTGTAGT AAATTACTGT TTTACAAATG TAAGATAACT TTTAACTGTG AGATTCCATA 540 

TTCCAGTCTT ACATTATTAT GTTTATCTGC CACAGGGAGA AGTCCTCAGA TAAAAATGTC 600 

TACCAAAAGA CTGACACGTG GAGTTAATCA TTTGACAGAT GCAAATGCTT CCACCCCCAA 660 

CAAATATACT TTCTTTAACT TCTGTGTGGG TATCACTTAG GGAAAAAAAG GCAGGCAACA 720 

AAATATTTTT TAATTCTATC TTAGGAAAAA TTGTAGNCAA ATCTTTTTNT CCCATTAACA 780 

AATAATGTAA GCCTTAATAT TCAAGGGGTA ATAAAAATAC AAAGTCTTCC AAACAGGTAA 840 
CTTACTTGAA AACTTT 
SEQ ID NO:62 PDG3 Protein sequence 
Protein Accession #: AAB18375 



1 11 21 31 41 51 

I I I I I I 

MGARGAPSRR RQAGRRLRYL PTGSFPFLLL LLLLCIQLGG GQKKKKNLLA EKVEQLMEWS 60 

SRRSIFRMNG DKFRKFIKAP PRN7SHXVHF TALQPQRQCS VCRQANEEYQ ILANSWRYSS 120 

AFCKKLFFSM VDYDEGTDVF QQLNMNSAPT PXHXPPKGRP KRADTFDLQR IGFAAEQLAK 180 

WIADRTDVHI RVFRPPNYSG TIALALLVSL VGGLLYXRRN NLEFIYNKTG WAMVSLCIVF 240 

AMTSGQMWNH IRGPPYAHKK PHNGQVSYIH GSSQAQFVAE SHTILVLNAA ITMGMVLLNE 300 
AATSKGDVGK RRIICLVGLG LWFFFSFLL SIFRSKYHGY FYSDLDPE 

SEQ ID NO:63 PDG8 DNA SEQUENCE 

Nucleic Acid Accession #: AL080235 

Coding sequence: 245453 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 
I I I I I I 

GGTCGCCGCA CCGGCCGOCT CCGGCCCGOC GCCGCCCCCA GCGCCGCCGC CGCCACCGCC 60 

GGGGCGCCCA CCGCGCTGCC AGCCTACCCC GCGGCGGAGC CGCCCGGGCC GCTGTGGCTG 120 

CAGGGCGAGC CGCTGCATTT CTGCTGCCTA GACTTCAGCC TGGAGGAGCT GCAGGGCGAG 180 

CCGGGCTGGC GGCTGAACCG TAAGCCCATT GAGTCCACGC TGGTGGCCTG CTTCATGACC 240 

CTGGTCATCG TGGTGTGGAG CGTGGCCGCC CTCATCTGGC CGGTGCCCAT CATCGCCGGC 300 

TTCCTGCCCA ACGGCATGGA ACAGCGCCGG ACCACCGCCA GCACCACCGC AGCCACCCCC 360 

GCCGCAGTGC CCGCAGGGAC CACCGCAGCC GCCGCCGCCG CCGCCGCTGC CGCCGCCGCC 420 

GCGGCCGTCA CTTCGGGGGT GGCGACCAAG TGACCCGCTC CGCTCCTCCC TGTGTCCGTC 480 

CTGTGTCCGC GCGCGCGGGT GCCTTTCCCG CCGGGGACTC GGCGGGTGTG CTTCGTGCTG 540 

TAGTTATCGT TAGTTCCTCT TCCCGAGATG GGGCCGCCGA GAGACCCCAG CGCCTTTGAA 600 

AAGCAAGGTT TGTGCTGCGC TTCCAGTTCC GAAAAGCAGA TGTTTAAGCC CTTGGACTGA 660 

GGGTGGGATC GCAGCTCCGA AGACGGAGAG GAGGGAAATG GGGCCCTTTC CCCTCTATTG 720 

CATCCCCCTG CCCGACTCCT TCCCCGCACC CACGTGCCCT AGATTCATGG CAGAAAATGA 780 

CCAAATCCTG TGTATTTGTT TTATATATTT AATAACTGTT TTAAATGAAA GTTTTAGTAA 840 

AAAAAATACA AAACAAAAAG ATTAAATTGC TATTGCTGTA GTAAGAGAAG CTCTTTGTAT 900 

CTGAACATAG TTGTATTTGA AATTTGTGGT TTTTTAATTT ATTTAAAATT GGGGGGAGGG 960 

CATGGGAAGG ATTTAACACC GATATATTGT TACCGCTGAA AATGAACTTT ATGAACCTTT 1020 

TCCAAGTTGA TCTATCCAGT GACGTGGCCT GGTGGGCGTT TCTTCTTGTA CTTATGTGGT 1080 
TTTTTGGCTT TTAATACAGA CATTTTCCTC CAAAAAAAAA AAAAAAAAGG 

SEQ ID NO:64 PDG8 Protein sequence 

1 11 21 31 41 51 

I I I I I I 

GRRTGRLRPA AAPSAAAATA GAPTALPAYP AAEPPGPLWL QGEPLHFCCL DFSLEELQGE 60 

PGWRLNRKPI ESTLVACFMT LVIVVWSVAA LIWPVPIIAG FLPNGMEQRR TTASTTAATP 120 
AAVPAGTTAA AAAAAAAAAA AAVTSGVATK 

SEQ ID NO:65 PDM1 DNA SEQUENCE 

Nucleic Acid Accession #: NM_006765 

Coding sequence: 149-1195 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

CGGCCGCGGC CCGGGTCCCT CGCAAAGCCG CTGCCATCCC GGAGGGCCCA GCCAGCGGGC 60 

TCCCGGAGGC TGGCCGGGCA GGCGTGGTGC GCGGTAGGAG CTGGGCGCGC ACGGCTACCG 120 

CGCGTGGAGG AGACACTGCC CTGCCGCGAT GGGGGCCCGG GGCGCTCCTT CACCCCGTAG 180 

GCAAGCGGGG CGGCGGCTGC GGTACCTGCC CACCGGGAGC TTTCCCTTCC TTCTCCTGCT 240 

GCTGCTGCTC TGCATCCAGC TCGGGGGAGG ACAGAAGAAA AAGGAGAATC TTTTAGCTGA 300 

AAAAGTAGAG CAGCTGATGG AATGGAGTTC CAGACGCTCA ATCTTCCGAA TGAATGGTGA 360 

TAAATTCCGA AAATTTATAA AGGCACCACC TCGAAACTAT TCCATGATTG TTATGTTCAC 420 

TGCTCTTCAG OCTCAGCGGC AGTGTTCTGT GTGCAGGCAA GCTAATGAAG AATATCAAAT 480 

ACTGGOGAAC TCCTGGOGCT ATTCATCTGC TTTTTGTRAC AAGCTCTTCT TCAGTATGGT 540 

GGACTATGAT GAGGGGACAG ACGTTTTTCA GCAGCTCAAC ATGAACTCTG CTCCTACATT 600 

CAYGCATTTW CCTCCAAAAG GCAGACCTAA GAGAGCTGAT ACTTTTGACC TCCAAAGAAT 660 

TGGATTTGCA GCTGAGCAAC TAGCAAAGTG GATTGCTGAC AGAACGGATG TTCATATTCG 720 

GGTTTTCAGA CCACCCAACT ACTCTGGTAC CATTGCTTTG GCCCTGTTAG TGTCGCTTGT 780 

TGGAGGTTTG CTTTATTNGA GAAGGAACAA CTTGGAGTTC ATCTATAACA AGACTGGTTG 840 

GGCCATGGTG TCTCTGTGTA TAGTCTTTGC TATGACTTCT GGCCAGATGT GGAACCATAT 900 

CCGTGGACCT CCATATGCTC ATAAGAACCC ACACAATGGA CAAGTGAGCT ACATTCATGG 960 

GAGCAGCCAG GCTCAGTTTG TGGCAGAATC ACACATTATT CTGGTACTGA ATGCCGCTAT 1020 

CACCATGGGG ATGGTTCTTC TAAATGAAGC AGCAACTTCG AAAGGCGATG TTGGAAAAAG 1080 

ACGGATAATT TGCCTAGTGG GATTGGGCCT GGTGGTCTTC TTCTTCAGTT TTCTACTTTC 1140 

AATATTTCGT TCCAAGTACC ACGGCTATCC TTATAGTGAT CTGGACTTTG AGTGAGAAGA 1200 

TGTGATTTGG ACCATGGCAC TTAAAAACTC TATAACCTCA GCTTTTTAAT TAAATGAAGC 1260 

CAASTGGGAT TTGCATAAAG TGAATGTTTA CCATGAAGAT AAACTGTTCC TGACTTTATA 1320 

CTATTTTGAA TTCATTCATT TCATTGTGAT CAGCTAGCTT ATTCTTGTGT ACTTTTTTTA 1380 

AACTGTGGGT TTTCCTAGTA AATTTAATTT ACAGAAATCA ATGGTAGCAT TTAGTAATCT 1440 

ACAAAGGAAA TATCAAAGTG TTTTTCAAGC CTGTTATATY CAGTGTGTKC CACAGGATTG 1500 
CAATAAATGA CAATGTAATT A 

SEQ ID NO:66 PDM1 Protein sequence 
Protein Accession #: NP_006756 

1 11 21 31 41 51 

I I I I I I 

MGARGAPSRR RQAGRRLRYL PTGSFPFLLL LLLLCIQLGG GQKKKENLLA EKVEQLMEWS 60 

SRRSIFRMNG DKFRKFIKAP PRNYSMIVMF TALQPQRQCS VCRQANEEYQ ILANSWRYSS 120 

AFCNKLFFSM VDYDEGTDVF QQLNMNSAPT FXHXPPKGRP KRADTFDLQR IGFAAEQLAK 180 

WIADRTDVHI RVFRPPNYSG TIALALLVSL VGGLLYXRRN NLEFIYNKTG WAMVSLCIVF 240 

AMTSGQMWNH IRGPPYAHKN PHNGQVSYIH GSSQAQFVAE SHIILVLNAA ITMGMVLLNE 300 
AATSKGDVGK RRIICLVGLG LVVFFFSFLL SIFRSKYHGY PYSDLDFE 
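
The "Coding sequence" range given for a DNA entry fixes the length of the encoded protein: the range must span a whole number of codons, and the last codon is the stop. The sketch below is an illustration only, with `cds_summary` as a hypothetical helper name, and shows that arithmetic for the range 149-1195 stated for SEQ ID NO:65.

```python
import re


def cds_summary(header):
    """Summarize a 'Coding sequence: START-END' header line.

    Returns the 1-based inclusive range, its length in nucleotides,
    whether that length is a multiple of three, and the expected number
    of residues (codons minus the stop codon) when it is in frame.
    """
    match = re.search(r"(\d+)\s*-\s*(\d+)", header)
    if match is None:
        raise ValueError("no start-end range found in header")
    start, end = int(match.group(1)), int(match.group(2))
    length = end - start + 1
    codons, remainder = divmod(length, 3)
    return {
        "start": start,
        "end": end,
        "length": length,
        "in_frame": remainder == 0,
        "residues": codons - 1 if remainder == 0 else None,
    }


summary = cds_summary("Coding sequence: 149-1195")
print(summary)
```

A range that is not a multiple of three, or an expected residue count that differs from the length of the printed protein, points to a transcription error in the range or in the sequence rows.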
SEQ ID NO:67 PDM2 DNA SEQUENCE 

Nucleic Acid Accession #: NM_000947 

Coding sequence: 88-1617 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

I I I I I I 

GGTTTCATAT GAACTCTCCC GCCACCCGGG AACAGCTGGC TGCCACCGTT TGTGTTTTCC 60 

GAGTTTGTAT TCTTGCAGGT GACCAAGATG GAGTTTTCTG GAAGAAAGCG GAGGAAGCTG 120 

AGGTTGGCAG GTGACCAGAG GAATGCTTCC TACCCTCATT GCCTTCAGTT TTACTTGCAG 180 

CCACCTTCTG AAAACATATC TTTAACAGAA TTTGAAAACT TGGCTATTGA TAGAGTTAAA 240 

TTGTTAAAAT CAGTTGAAAA TCTTGGAGTG AGCTATGTGA AAGGAACTGA ACAATACCAG 300 

AGTAAGTTGG AGAGTGAGCT TCGGAAGCTC AAGTTTTCCT ACAGAGAGAA GCTAGAAGAT 360 

GAATATGAAC CACGAAGAAG AGATCATATT TCTCATTTTA TTTTGCGGCT TGCTTATTGC 420 

CAGTCTGAAG AACTTAGACG CTGGTTCATT CAACAAGAAA TGGATCTCCT TCGATTTAGA 480 

TTTAGTATTT TACCCAAGGA TAAAATTCAG GATTTCTTAA AGGATAGCCA ATTGCAGTTT 540 

GAGGCTATAA GTGATGAAGA GAAGACTCTT CGAGAACAGG AGATTGTTGC CTCATCACCA 600 

AGTTTAAGTG GACTTAAGTT GGGGTTCGAG TOCATTTATA AGATCCCTTT TGCTGATGCT 660 

CTGGATTTGT TTCGAGGAAG GAAAGTCTAT TTGGAAGATG GCTTTGCTTA CGTACCACTT 720 

AAGGACATTG TGGCAATCAT CCTGAATGAA TTTAGAGCCA AACTGTCCAA GGCTTTGGCA 780 

TTAACAGCCA GGTCCTTGCC TGCTGTGCAG TCTQATGAAA GACTTCAGCC TCTGCTCAAT 840 

CACCTCAGTC ATTCCTACAC TGGCCAAGAT TACAGTACCC AGGGAAATGT TGGGAAGATT 900 

TCTTTAGATC AGATTGATTT GCTTTCTACC AAATCCTTCC CACCTTGCAT OCGTCAGTTA 960 

CATAAAGCCT TGCGGGAAAA TCACCATCTT CGTCATGGAG GCCGAATGCA GTATGGCCTA 1020 

TTTCTGAAGG GCATTQGTTT AACTTTGGAA CAGGCATTGC AOTTCTGGAA GCAAGAATTT 1080 

ATCAAAGGAA AGATGGATCC AGACAAGTTT GATAAAGGTT ACTCTTACAA CATCCQTCAC 1140 

AGCTTTGGAA AGGAAGGCAA GAOQACAGAC TATACACCTT TCAOTTQCCT GAAGATTATT 1200 

CTGTCCAATC CACCAAGCCA AGGGGATTAT CATGGGTGCC CATTCCGTCA CAGTGATCCA 1260 

GAGCTGCTGA AGCAAAAGTT GCAGTCATAC AAGATCTCTC CTGGAGGGAT AAGCCAGATT 1320 

TTGGATTTAG TAAAGGGGAC ACATTACCAG GTAGCCTGTC AAAAATACTT TGAGATGATA 1380 

CACAATGTGG ATGATTGTGG CTTTTCTTTG AATCATCCTA ATCAGTTCTT TTGTGAGAGC 1440 

CAACGTATTC TAAATGGTGG TAAAQACATA AAGAAGGAAC CTATCCAACC AGAAACTCCT 1500 

CAACCCAAAC CAAGTGTCCA GAAAACCAAG GATGCATCAT CTGCTCTGGC CTCTTTAAAT 1560 

TCCTCTCTGG AAATGGATAT GGAAGGACTA GAAGATTACT TTAGTGAAGA TTCTTAGGCA 1620 

GTTTTATAAC CCTTTTTCCT CAATAGCCTG TTTCCTGTTT TTAAGATTTT GCCTTTGTTG 1680 

TTGAAAAAGG GTTTCACTGT CACCAAGGCT TAGTGCAGTG ACACAATTAC AGCTGATTGC 1740 

AGCCTTGACC TTCCCAGCTC AAGTGATCCT CCTACCTCAG CCTCCCAAGT AGTTAGGACA 1800 

CACAGGTGTG CACCTCATAT CCAGATAATT TTTTTCAATT TTTTTTTGTA GAGGTGGGGG 1860 

GTCTCCCTAT GTTGCCCAGG CAGATCTCAG ACTCCTGGGC TCAAGCGATC CTCACACCTC 1920 

AGCGTCCCAG AGTGCTGGGA TTACAGTTGT GAGCCACTGT GCCTGGCCTT TTTTTTTTTT 1980 

TAACCTTTTC GTTTAACTTC TCTCTTCACT GCATCCCAAT CCATCTACAG GCATGCACAC 2040 

TTATTAGGAA AGGAGGTTTG AGGTAACAAC AGAGACTTTC ACTATATTTT GCTTTGACAG 2100 

AAGGAAAGAG GAGGAGTTTC TATTAAAATC TGTCACTTGA GTGATGTCAT TTAAGTCCTA 2160 

TTTTAGGAGA TAAAAACAGC TTTGGGGACT GGTTAAAGTC CCCCAGAAAC TACAATAAAG 2220 

AACAACTTTT GTTTTAACTC TTAATCACTT TGTAATTTTG ACTCAATCCT TTTCTGGACC 2280 
ATTTTTGTTA ATAAATATCA AAGTGT 



SEQ ID NO:68 PDM2 Protein sequence 
Protein Accession #: NP_000938 

1 11 21 31 41 51 

I I I I I I 

MEFSGRKRRK LRLAGDQRNA SYPHCLQFYL QPPSENISLT EFENLAIDRV KLLKSVENLG 60 

VSYVKGTEQY QSKLESELRK LKFSYHEKLE DEYEPRRRDH ISHFILRLAV CQSEELRRWF 120 

IQQEMDLLRF RFSILPKDKI QDFLKDSQLQ FEAISDEEKT LREQEIVASS PSLSGLKLGF 180 

ESIYKIPFAD ALDLFRGRKV YLEDGFAYVP LKDIVAIILN EFRAKLSKAL ALTARSLPAV 240 

QSDERLQPLL NHLSHSYTGQ DYSTQGNVGK ISLDQIDLLS TKSFPPCMRQ LHKALRENHH 300 

LRHGGRMQYG LFLKGIGLTL EQALQFWKQE FIKGKMDPDK FDKGYSYNIR HSFGKEGKRT 360 

DYTPFSCLKI ILSNPPSQGD YHGCPFRHSD PELLKQKLQS YKISPGGISQ ILDLVKGTHY 420 

QVACQKYFEM IHNVDDCGFS LNHPNQFFCE SQRILNGGKD IKKEPIQPET PQPKPSVQKT 480 
KDASSALASL NSSLEMDMEG LEDYFSEDS 

SEQ ID NO:69 PDM3 DNA SEQUENCE 

Nucleic Acid Accession #: NM_024840 

Coding sequence: 108-491 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

AATTCATACA GGAGAGAAGT CATATATATG CAGTGATTGT GGAAAAGGCT TCATCAAGAA 60 

GTCTCGGCTC ATTAATCATC AGAGAGTTCA TACAGGAGAG AAACCACATG GATGCAGOCT 120 

GTGTGGGAAG GCCTTCTCCA AAAGGTCCAG GCTCACTGAA CACCAGAGAA CTCATACAGG 180 

AGAGAAGCCC TATGAATGCA CTGAATGTGA CAAAGCATTC CGCTGGAAAT CACAGCTCAA 240 

TGCACATCAG AAAGCTCACA CAGGAGAGAA GTCATATATA TGCCGTGATT GTGGAAAAGG 300 

CTTCATTCAG AAGGGAAATC TCATTGTACA TCAGCGAATT CATACTGGAG AAAAACCCTA 360 

TATATGCAAT GAATGTGGAA AAGGCTTCAT CCAAAAGGGC AACCTCCTTA TTCATCGACG 420 

TACTCACACT GGAGAGAAAC CCTATGAATG CAATGAATGT GGGAAAGGCT TCAGCCAGAA 480 

GACATGT TTA ATATCCCATC AGAGATTTCA CACAGGAAAG ACACCCTTTG TATGTACTGA 540 

GTGTGGAAAA TCCTGCTCAC ACAAGTCAGG TCTCATTAAC CACCAGAGAA TTCACACAGG 600 

AGAGAAACCC TATACATGCA GTGACTGTGG GAAAGCTTTC AGAGATAAAT CATGTCTCAA 660 

CAGACATCGG AGAACTCATA CAGGGGAGAG ACCGTATGGA TGCTCTGATT GTGGGAAAGC 720 

TTTCTCCCAC TTGTCATGCC TTGTTTATCA TAAGGGAATG CTGCATGCAA GAGAGAAATG 780 

TGTAGGTTCA GTCAAATTGG AAAATCCTTG CTCAGAGAGT CATAGCTTAT CACATACACG 840 

TGATCTCATA CAGGATAAAG ACTCTGTTAA CATGGTGACT CTGCAGATGC CTTCTGTGGC 900 

AGCTCAGACC TCATTAACTA ACAGTGCGTT CCAAGCAGAG AGCAAAGTAG CCATTGTGAG 960 

CCAGCCTGTT GCCAGAAGTT CAGTCTCAGC AGATAGTAGA ATTTGCACAG AATAAAAACC 1020 

ATATGAATGC AGTGAATGTG GTAGTGCTTT CAGTGATCAA TTACATCATA TGTCACAAAA 1080 

AACACAGAGG AACAAACTGA TATATTCAAG GTGGAAAGCC CTTGAATAAA ACCTTATGGC 1140 

TAATAAGCAT ATACTCAGAG AAAAATAGTA TGAAGTGGAG ACTGGGAAAT TCTTTTATGG 1200 

GAAGATAGAT CTTCTCATCA GTGACCATAG ATCACATCTT CAGTGAGCTT ATAGTTGGTA 1260 

GAAATATAAT GATCATGGAA AAGTCCTTGT TCAGAAACAG TACGCCAGTA GGTATCAGGG 1320 

GGTTTACACA GGAGAGAAAC TTTTGGAAGA CCTTTGAAGG CTATGAATGT GGCAGGGTTG 1380 

CTAGTGOTAC ATTCTGCCTT ATCCTCAGAG GGAATCATAT AGAAATAAAA CTATGAAAAT 1440 

GTAACTAGAA CATCTTCATC AAAATATGAA AGAACACACG AAGCAAATAA GCCCTGTGAA 1500 

AAGGAGTATT TTAGAGATTT CGATCAGAAA TCTAACATCA TTATATGGCA GATAATATAC 1560 

AGGATGTGTA TTTTAGGACA ATATACCTTG AATCACTAGT TGATATGTCA ATGACTAATT 1620 

AAAAGGGGTT GTCAGTGTTA CACATCATTG GTTAAATTTA TAGCACAATG TACCTCTTCC 1680 

CCCTTTTTTG ATAAGAGTCT TCTATTCCCA ACCAAGATCA TTATATGATT AGCTCTTGTG 1740 

TTTCTTTGAT TCCAAATTTC TTCACTTGTT ATTTCAGACT ACTGAAGCTC TTCAAAAGGA 1800 
AAAATGTATT TAATTTAATA ATGTAACACA ACAAQTTTGO ATGTGTTTAA CTTTATAAAT 1860 
AATCACCCCA GAGGAATGAA GTTCAAAACT TGTGAATAAC C 

SEQ ID NO:70 PDM3 Protein sequence 

Protein Accession #: NP_079116 

1 11 21 31 41 51 

MDAACVGRPS PKGPGSLNTR ELIQERSPMN ALNVTKHSAG NHSSMHIRKL TQERSHIYAV 60 
IVEKASFRRE ISLYISEFIL EKNPIYAMNV EKASSKRATS LFIDVLTLER NPMNAMNVGK 120 
ASARRHV 

SEQ ID NO:71 PDM8 DNA SEQUENCE 

Nucleic Acid Accession #: NM_018455 

Coding sequence: 341-955 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

AATTTCGGCA CGGG6GGGAG GCACAGTGAG TCCACTG6GG CACGGCAGCG TCTAAGCCAC 60 

AAGCCGACTG ACATAAGCCA GGTCCTAACG GA6CCTAT6T GTAAGTCCAC TACTGGTGCA 120 

AGGTTGCACA CTTCTAAGAA GAGCGGCGTG GGGGGCTCGG CGACCTTCGC TTCAGTCGCT 180 

CCCCCGTGCA GTCCCCTGTG CCCAAGACAC AGCCT6ATGC TTGTGCTCCG GTGGGCGGAC 240 

TTGGAGGC6G CGGGAACTGC AATTGGTGGC TTTGAAGGGC GGCGAGCGGG AACAGCTCTT 300 

GAGGAGTGAG ACTGCAGGAG ATGTGGGCCG TGCCAAAGAG ATGGATGAGA CTGTTGCTGA 360 

GTTCATCAAG AGGACCATCT TGAAAATCCC CATGAATGAA CTGACAACAA TCCTGAAGGC 420 

CTGQGATTTT TTGTCTGAAA ATCAACTGCA GACTGTAAAT TTCCGACAGA GAAAGGAATC 480 

TGTAGTTCAG CACTTGATCC ATCTGTGTGA GGAAAAGCGT GCAAGTATCA GTGATGCTGC 540 

CCTOTTAGAC ATCATTTATA TGCAATTTCA TCAGCACCAG AAAGTTTGGG ATGTTTTTCA 600 

GATGAGTAAA GGACCAGGTG AAGATGTTGA CCTTTTTGAT ATGAAACAAT TTAAAAATTC 660 

GTTCAAGAAA ATTCTTCAGA GAGCATTAAA AAATGTGACA GTCAGCTTCA GAGAAACTGA 720 

GGAGAATGCA GTCTGGATTC GAATTGCCTG GGGAACACAG TACACAAAGC CAAACCAGTA 780 

CAAACCTACC TACGTGGTGT ACTACTCCCA GACTCCGTAC GCCTTCACGT CCTCCTCCAT 840 

GCTGAGGCGC AATACACCGC TTCTGGGTCA GGAGTTAGAA GCTACTGGGA AAATCTACCT 900 

CCGACAAGAG GAGATCATTT TAGATATTAC CGAAATGAAG AAAGCTTGCA ATTAGTGAAC 960 
ATGAAAGGAA AATAAAAATT CCTCACAGTC AAAAAAAAAA AAAAA 
SEQ ID NO:72 PDM8 Protein sequence 
Protein Accession #: NP_060925 

1 11 21 31 41 51 

I I I I I I 

MDETVAEFIK RTILKIPMNE LTTILKAWDF LSENQLQTVN FRQRKESVVQ HLIHLCEEKR 60 

ASISDAALLD IIYMQFHQHQ KVWDVFQMSK GPGEDVDLFD MKQFKNSFKK ILQRALKNVT 120 

VSFRETEENA VWIRIAWGTQ YTKPNQYKPT YVVYYSQTPY AFTSSSMLRR NTPLLGQELE 180 
ATGKIYLRQE EIILDITEMK KACN 

SEQ ID NO:73 PDM9 DNA SEQUENCE 

Nucleic Acid Accession #: NM_016192 
Coding sequence: 1-1125 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

ATGGTGCTGT GGGAGTCCCC GCGGCAGTGC AGCAGCTGGA CACTTTGCGA GGGCTTTTGC 60 

TGGCTGCTGC TGCTGCCCGT CATGCTACTC ATCGTAGCCC GCCCGGTGAA GCTOGCTGCT 120 

TTCCCTACCT CCTTAAGTGA CTGCCAAACG CCCACCGGCT GGAATTGCTC TGGTTATGAT 180 

GACAGAGAAA ATGATCTCTT CCTCTGTGAC ACCAACACCT GTAAATTTGA TGGGGAATGT 240 

TTAAGAATTG GAGACACTGT GACTTGCGTC TGTCAGTTCA AGTGCAACAA TGACTATGTG 300 

CCTGTGTGTG GCTCCAATGG GGAGAGCTAC CAGAATGAGT GTTACCTGCG ACAGGCTGCA 360 

TGCAAACAGC AGAGTGAGAT ACTTGTGGTG TCAGAAGGAT CATGTGCCAC AGATGCAGGA 420 

TCAGGATCTG GAGATGGAGT CCATGAAGGC TCTGGAGAAA CTAGTCAAAA GGAGACATCC 480 

ACCTGTGATA TTTGCCAGTT TGGTGCAGAA TGTGACGAAG ATGCCGAGGA TGTCTGGTGT 540 

GTGTGTAATA TTGACTGTTC TCAAACCAAC TTCAATCCCC TCTGCGCTTC TGATGGGAAA 600 

TCTTATGATA ATGCATGCCA AATCAAAGAA GCATCGTGTC AGAAACAGGA GAAAATTGAA 660

GTCATGTCTT TGGGTCGATG TCAAGATAAC ACAACTACAA CTACTAAGTC TGAAGATGGG 720

CATTATGCAA GAACAGATTA TGCAGAGAAT GCTAACAAAT TAGAAGAAAG TGCCAGAGAA 780 

CACCACATAC CTTGTCCGGA ACATTACAAT GGCTTCTGCA TGCATGGGAA GTGTGAGCAT 840 

TCTATCAATA TGCAGGAGCC ATCTTGCAGG TGTGATGCTG GTTATACTGG ACAACACTGT 900 

GAAAAAAAGG ACTACAGTGT TCTATACGTT GTTCCCGGTC CTGTACGATT TCAGTATGTC 960 

TTAATCGCAG CTGTGATTGG AACAATTCAG ATTGCTGTCA TCTGTGTGGT GGTCCTCTGC 1020

ATCACAAGGA AATGCCCCAG AAGCAACAGA ATTCACAGAC AGAAGCAAAA TACAGGGCAC 1080 
TACAGTTCAG ACAATACAAC AAGAGCGTCC ACGAGGTTAA TCTGA 
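
The "Coding sequence: start-end" lines in this listing use 1-based, inclusive nucleotide positions. As an illustration only (it is not part of the original listing), the following minimal Python sketch shows one way a reader might check that a stated coding range is consistent with a listed sequence: the range should be a multiple of three in length, begin with ATG, and end with a stop codon. The function names `join_rows` and `check_cds` are hypothetical, and the short test sequence is invented for the example.

```python
# Illustrative sketch only; helper names and the toy sequence are assumptions.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def join_rows(rows):
    """Concatenate listing rows, keeping only nucleotide letters.

    Spaces between 10-base blocks and the trailing position number
    (e.g. "60") are dropped because they are not in "ACGTN".
    """
    return "".join(ch for row in rows for ch in row.upper() if ch in "ACGTN")


def check_cds(seq, start, end):
    """Check a 1-based, inclusive coding range against a nucleotide string."""
    cds = seq[start - 1:end]  # convert to Python's 0-based, end-exclusive slice
    return {
        "length_multiple_of_3": len(cds) % 3 == 0,
        "starts_with_ATG": cds[:3] == "ATG",
        "ends_with_stop": cds[-3:] in STOP_CODONS,
    }


# Toy example: coding range 3-11 of a 13-base sequence (ATG AAA TGA).
toy = join_rows(["GGATGAAATG ACC 13"])
print(check_cds(toy, 3, 11))

# First and last rows of SEQ ID NO:73 as printed above (coding range 1-1125).
first_row = join_rows(
    ["ATGGTGCTGT GGGAGTCCCC GCGGCAGTGC AGCAGCTGGA CACTTTGCGA GGGCTTTTGC 60"]
)
last_row = join_rows(["TACAGTTCAG ACAATACAAC AAGAGCGTCC ACGAGGTTAA TCTGA"])
print(first_row[:3], last_row[-3:], 1125 % 3 == 0)
```

Applied to a full record, the same check would take all rows of the sequence and the range given on its "Coding sequence" line.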
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SEQ ID NO:74 PDM9 Protein sequence:
Protein Accession #: NP_057276

1 11 21 31 41 51

| | | | | |

1 MVLWESPRQC SSWTLCEGFC WLLLLPVMLL IVARPVKLAA FPTSLSDCQT PTGWNCSGYD 60
61 DRENDLFLCD TNTCKFDGEC LRIGDTVTCV CQFKCNNDYV PVCGSNGESY QNECYLRQAA 120
121 CKQQSEILVV SEGSCATDAG SGSGDGVHEG SGETSQKETS TCDICQFGAE CDEDAEDVWC 180
181 VCNIDCSQTN FNPLCASDGK SYDNACQIKE ASCQKQEKIE VMSLGRCQDN TTTTTKSEDG 240

241 HYARTDYAEN ANKLEESARE HHIPCPEHYN GFCMHGKCEH SINMQEPSCR CDAGYTGQHC 300

301 EKKDYSVLYV VPGPVRFQYV LIAAVIGTIQ IAVICVVVLC ITRKCPRSNR IHRQKQNTGH 360
361 YSSDNTTRAS TRLI 

SEQ ID NO:75 PD01 DNA SEQUENCE

Nucleic Acid Accession #: NM_014324

Coding sequence: 89-1237 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

GGCGCCGGGA TTGGGAGGGC TTCTTGCAGG CTGCTGGGCT GGGGCTAAGG GCTGCTCAGT 60

TTCCTTCAGC GGGGCACTGG GAAGCGCCAT GGCACTGCAG GGCATCTCGG TCGTGGAGCT 120

GTCCGGCCTG GCCCCGGGCC GTOTCTGTGC TATGGTCCTG GCTGACTTCG GGGCGCGTGT 180 

GGTACGCGTG GACCGGCCCG GCTCCCGCTA CGACGTGAGC CGCTTGGGCC GGGGCAAGCG 240 

CTCGCTAGTG CTGGACCTGA AGCAGCCGCG GGAGCCGCGT GCTGCGGCGT CTGTGCAAGC 300

GGTCGGATGT GCTGCTGGAG CCCTTCCGCC GCGGTGTCAT GGAGAAACTC CAGCTGGGCC 360

CAGAGATTCT GCAGCGGGAA AATCCAAGGC TTATTTATGC CAGGCTGAGT GGATTTGGCC 420

AGTTCAGGAA AGCTTCTGCC GGTTAGCTGG CCACGATATC AACTATTTGG CTTTGTCAGG 480 

TGTTCTCTCA AAAATTGGCA GAAGTGGTGA GAATCCGTAT GCCCCGCTGA ATCTCGTGGC 540 

TCACTTTGCT GGTGGTGGCC TTATGTGTGC ACTGGGCATT ATAATGGCTC TTTTTGACCG 600 

CACACGCACT GACAAGGGTC AGGTCATTGA TGCAAATATG GTGGAAGGAA CAGCATATTT 660

AAGTTCTTTT CTGTGGAAAA CTCAGAAATC GAGTCTGTGG GAAGCACCTC GAGGACAGAA 720 

CATGTTGGAT GGTGGAGCAC CTTTCTATAC GACTTACAGG ACAGCAGATG GGGAATTCAT 780

GGCTGTTGGA GCAATAGAAC CCCAGTTCTA CGAGCTGCTG ATCAAAGGAC TTGGACTAAA 840 

GTCTGATGAA CTTCCCAATC AGATGAGCAC GGATGATTGG CCAGAAATGA AGAAGAAGTT 900 

TGCAGATGTA TTTGCAAAGA AGACGAAGGC AGAGTGGTGT CAAATCTTTG ACGGCACAGA 960

TGCCTGTGTG ACTCCGGTTC TGACTTTTGA GGAGGTTGTT CATCATGATC ACAACAAGGA 1020 

ACGGGGCTCG TTTATCACCA GTGAGGAGCA GGACGTGAGC CCCCGCCTTG CACCTCTGCT 1080 

GTTAAACACC CCAGCCATCC CTTCTTCCAA AGGGGATCCT TTCATAGGAG AACACACTGA 1140 

GGAGATACTT GAAGAATTTG GATTCAGCCG AGAAGAGATT TATCAGCTTA ACTCAGATAA 1200

AATCATTGAA AGTAATAAGG TAAAAGCTAG TCTCTAACTT CCAGGCCCAC GGCTCAAGTG 1260

AATTTGAATA CTGCATTTAC AGTGTAGAGT AACACATAAC ATTGTATGCA TGGAAACATG 1320 

GAGGAACAGT ATTACAGTGT CCTACCACTC TAATCAAGAA AAGAATTACA GACTCTGATT 1380 

CTACAGTGAT GATTGAATTC TAAAAATGGT TATCATTAGG GCTTTTGATT TATAAAACTT 1440 

TGGGTACTTA TACTAAATTA TGGTAGTTAT TCTGCCTTCC AGTTTGCTTG ATATATTTGT 1500

TGATATTAAG ATTCTTGACT TATATTTTGA ATGGGTTCTA GTGAAAAAGG AATGATATAT 1560

TCTTGAAGAC ATCGATATAC ATTTATTTAC ACTCTTGATT CTACAATGTA GAAAATGAGG 1620 

AAATGCCACA AATTGTATGG TGATAAAAGT CACGTGAAAC AGAGTGATTG GTTGCATCCA 1680 

C GC CT T TTO T CTTGGTGTTC ATGATCTCCC TCTAAGCACA TTCCAAACTT TAGCAACAGT 1740 

TATCACACTT TGTAATTTGC AAAGAAAAGT TTCACCTGTA TTGAATCAGA ATGCCTTCAA 1800

CTGAAAAAAA CATATCCAAA ATAATGAGGA AATGTGTTGG CTCACTACGT AGAGTCCAGA 1860

GGGACAGTCA GTTTTAGGGT TGCCTGTATC CAGTAACTCG GGGCCTGTTT CCCCGTGGGT 1920 

CTCTGGGCTG TCAGCTTTCC TTTCTCCATG TGTTTGATTT CTCCTCAGGC TGGTAGCAAG 1980 

TTCTGGATCT TATACCCAAC ACACAGCAAC ATCCAGAAAT AAAGATCTCA GGACCCCCCA 2040 

SEQ ID NO:76 PD01 Protein sequence:
Protein Accession #: NP_055139

1 11 21 31 41 51

| | | | | |

1 MALQGISVVE LSGLAPGRXC AMVLADFGAR VVRVDRPGSR YDVSRLGRGK RSLVLDLKQP 60
61 REPRAAASVQ AVGCAAGALP PRCHGETPAG PRDSAAGKSK AYLCQAEWIW FVQESFCRLA 120 
121 GRDXNYLALS GVLSKIGRSG ENPYAPLNLV ADPAGGGLHC ALGIIHALFD RTRTDKGQVT 180 
181 DANMVEGTAY LSSFLWKTQK SSLWEAPRGQ NMLDGGAPFY TTYRTADGEF MAVGAIEPQF 240 
241 YELLIKGLGL KSDELPNQMS TDDWPEMKKK FADVFAKKTK AEWCQIFDGT DACVTPVLTF 300

301 EEVVHHDHNK ERGSFITSEE QDVSPRLAPL LLNTPAIPSS KGDPFIGEHT EEILEEFGFS 360
361 REEIYQLNSD KIIESNKVKA SL 

SEQ ID NO:77 PD03 DNA SEQUENCE

Nucleic Acid Accession #: AB028951

Coding sequence: 97-1128 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

GTTAAATCCT TACTTTACCA GATTCTTGAT GGTATCCATT ACCTCCATGC AAATTGGGTG 60

CTTCACAGAG ACTTGAAACC AGCAAATATC CTAGTAATGG GAGAAGGTCC TGAGAGGGGG 120 

AGAGTCAAAA TAGCTGACAT GGGTTTTGCC AGATTATTCA ATTCTCCTCT AAAGCCACTA 180 

GCAGATTTGG ATCCAGTAGT TGTGACATTT TGGTATCGGG CTCCAGAACT TTTGCTTGGT 240 

GCAAGGCATT ATACAAAGGC CATTGATATA TGGGCAATAG GTTGTATATT TGCTGAATTG 300 

TTGACTTCGG AACCTATTTT TCACTGTCGT CAGGAAGATA TAAAAACAAG CAATCCCTTT 360
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CATCATGATC AACTGGATCG GATATTTAGT GTCATGGGGT TTCCTGCAGA TAAAGACTGG 420 
GAAGATATTA GAAAGATGCC AGAATATCCC ACACTTCAAA AAGACTTTAG AAGAACAACG 480 
TATGCCAACA GTAGCCTCAT AAAGTACATG GAGAAACACA AGGTCAAGCC TGACAGCAAA 540 
GTGTTCCTCT TGCTTCAGAA ACTCCTGACC ATGGATCCAA CCAAGAGAAT TACCTCGGAG 600 
CAAGCTCTGC AGGATCCCTA TTTTCAGGAG GACCCTTTGC CAACATTAGA TGTATTTGCC 660
GGCTGCCAGA TTCCATACCC CAAACGAGAA TTCCTTAATG AAGATGATCC TGAAGAAAAA 720
GGTGACAAGA ATCAGCAACA GCAGCAGAAC CAGCATCAGC AGCCCACAGC CCCTCCACAG 780 
CAGGCAGCAG CCCCTCCACA GGCGCCCCCA CCACAGCAGA ACAGCACCCA GACCAACGGG 840 
ACOGCAGGTG GGGCTGGGGC CGGGGTCGGG GGCACCGGAG CAGGGTTGCA GCACAGCCAG 900 

GACTCCAGCC TGAACCAGGT GCCTCCAAAC AAGAAGCCAC GGCTAGGGCC TTCAGGCGCA 960
AACTCAGGTG GACCTGTGAT GCCCTCGGAT TATCAGCACT CCAGTTCTCG CCTGAATTAC 1020
CAAAGCAGCG TTCAGGGATC CTCTCAGTCC CAGAGCACAC TTGGCTACTC TTCCTCGTCT 1080 
CAGCAGAGCT CACAGTACCA CCCATCTCAC CAGGCCCACC GGTACTGACC AGCTCCCGTT 1140 
GGGCCAGGCC AGCCCAGCCC AGAGCACAGG CTCCAGCAAT ATGTCTGCAT TGAAAAGAAC 1200 

CAAAAAAATG CAAACTATGA TGCCATTTAA AACTCATACA CATGGGAGGA AAACCTTATA 1260
TACTGAGCAT TGTGCAGGAC TGATAGCTCT TCTTTATTGA CTTAAAGAAG ATTCTTGTGA 1320 
AGTTTCCCCA GCACCCCTTC CCTGCATGTG TTCCATTGTG ACTTCTCTGA TAAAGCGTCT 1380 
GATCTAATCC CAGCACTTCT GTAACCTTCA GCATTTCTTT GAAGGATTTC CTGGTGCACC 1440 
TTTCTCATGC TGTAGCAATC ACTATGGTTT ATCTTTTCAA AGCTCTTTTA ATAGGATTTT 1500 

AATGTTTTAG AAACAGGATT CCAGTGGTGT ATAGTTTTAT ACTTCATGAA CTGATTTAGC 1560
AACACAGGTA AAAATGCACC TTTTAAAGCA CTACGTTTTC ACAGACAATA ACTGTTCTGC 1620 
TCATGGAAGT CTTAAACAGA AACTGTTACT GTCCCAAAGT ACTTTACTAT TACGTTCGTA 1680 
TTTATCTAGT TTCAGGGAAG GTCTAATAAA AAGACAAGCG GTGGGACAGA GGGAACCTAC 1740 
AACCAAAAAC TGCCTAGATC TTTGCAGTTA TGTGCTTTAT GCCACGAAGA ACTGAAGTAT 1800 

GTGGTAATTT TTATAGAATC ATTCATATGG AACTGAGTTC CCAGCATCAT CTTATTCTGA 1860
ATAGCATTCA GTAATTAAGA ATTACAATTT TAACCTTCAT GTAGCTAAGT CTACCTTAAA 1920 
AAGGGTTTCA AGAGCTTTGT ACAGTCTCGA TGGCCCACAC CAAAACGCTG AAGAGAGTAA 1980 
CAACTGCACT AGGATTTCTG TAAGGAGTAA TTTTGATCAA AAGACGTGTT ACTTCCCTTT 2040 
GAAGGAAAAG TTTTTAGTGT GTATTGTACA TAAAGTCGGC TTCTCTAAAG AACCATTGGT 2100 

TTCTTCACAT CTGGGTCTGC GTGAGTAACT TTCTTGCATA ATCAAGGTTA CTCAAGTAGA 2160
AGCCTGAAAA TTAATCTGCT TTTAAAATAA AGAGCAGTGT TCTCCATTCG TATTTGTATT 2220 
AGATAXAGAG TGACTATTTT TAAAGCATGT TAAAAATTTA GGTTTTATTC ATGTTTAAAG 2280 
TATGTATTAT GTATGCATAA TTTTGCTGTT GTTACTGAAA CTTAATTCTA TCAAGAATCT 2340 
TTTTCATTGC ACTGAATGAT TTCTTTTGCC CCTAGGAGAA AACTTAATAA TTGTGCCTAA 2400

AAACTATGGG CGGATAGTAT AAGACTATAC TAGACAAAGT GAATATTTGC ATTTCCATTA 2460
TCTATGAATT AGTGGCTGAG TTCTTTCTTA GCTGCTTTAA GGAGCCCCTC ACTCCCCAGA 2520 
GTCAAAAGGA AATGTAAAAA CTTAGAGCTC CCATTGTAAT GTAAGGGGCA AGAAATTTGT 2580 
GTTCTTCTGA ATGCTACTAG CAGCACCAGC CTTGTTTTAA ATGTTTTCTT GAGCTAGAAG 2640 
AAATAGCTGA TTATTGTATA TGCAAATTAC ATGCATTTTT AAAAACTATT CTTTCTGAAC 2700 

TTATCTACCT GGTTATGATA CTGTGGGTCC ATACACAAGT AAAATAAGAT TAGACAGAAG 2760
CCAGTATACA TTTTGCACTA TTGATGTGAT ACTGTAGCCA GCCAGGACCT TACTGATCTC 2820
AGCATAATAA TGCTCACTAA TAATGAAGTC TGCATAGTGA CACTCATCAA GACTGAAGAT 2880
GAAGCAGGTT ACGTGCTCCA TTGGAAGGAG TTTCTGATAG TCTCCTGCTG TTTTACCCCT 2940 
TCCATTTTTT AAAATAAGAA ATTAGCAGCC CTCTGCATAA TGTAGCTGCC TATATGCAGT 3000 

TTTATCCTGT GCCCTAAAGC CTCACTGTCC AGAGCTGTTG GTCATCAGAT GCTTATTGCA 3060
CCCTCACCAT GTGCCTGGTG CCCTGCTGGG TAGAGAACAC AGAGGACAGG GCATACTTCT 3120 
TGTCCTTAAG GAGCTTGTGA TCTGTGACAG TAAGCCCTCC TGGGATGTCT GTGCCATGTG 3180 
ATTGACTTAC AAGTGAAACT GTCTTATAAT ATGAAGGTCT TTTTGTTTAC TTCTAAAOCC 3240 
ACTTGGGTAG TTACTATCCC CAAATCTGTT CTGTAAATAA TATTATGGAA GGGTTTCTAT 3300 

GTCAGTCTAC CTTAGAGAAA GCCAGTGATT CAATATCACA AAAGGCATTG ACGTATCTTT 3360
GAAATGTTCA CAGCAGCCTT TTAACAACAA CTGGGTGGTC CTTGTAGGCA GAACATACTC 3420 
TCCTAAGTGG TTGTAGGAAA TTGCAAGGAA AATAGAAGGT CTGTTCTTGC TCTCAAGGAG 3480 
GTTACCTTTA ATAAAAGAAG ACAAACCCAG ATAGATATGT AAACCAAAAT ACTATGCCCC 3540 
TTAATACTTT ATAAGCAGCA TTGTTAAATA GTTCTTACGC TTATACATTC ACAGAACTAC 3600 

OCTGTTTTCC TTGTATATAA TGACTTTTGC TGGCAGAACT GAAATATAAA CTGTAAGGGG 3660
ATTTCGTCAG TTGCTCCCAG TATACAATAT CCTCCAGGAC ATAGCCAGAA ATCTCCATTC 3720 
CACACATGAC TGAGTTCCTA TCCCTGCACT GGTACTGGCT CTTTTCTCCT CTTTCCTTGC 3780 
CTCAGGGTTC GTGCTACCCA CTGATTCCCT TTACCCTTAG TAATAATTTT GGATCATTTT 3840 
CTTTCCTTTA AAGGGGAACA AAGCCTTTTT TTTTTTTGAG ACGGAGTGTT GCTCTGTCAC 3900

CCAAGCTGGA GTGCAGTGGC ACGATCTTGG CTCACTCCAA CCTCCACCTT CCAGGTTCAA 3960
GTGATTCTCC TGCCTCAGCC TCCCGAGTAG CTGGGACTAC GGGCACGCAC CACCACGTCT 4020 
GGCTAATTTT TGTATTTTTA GTAGAGATGG GGTTTCACCC TATTGGTCAG GCTGGTCTTG 4080 
AATTCCTCAC CTCAGGTCAT CCGCCTGTCT CGGCCTOCCG AAGTGCTGGG ATTATAGGTG 4140 

TGAGCCACCG CACCCAGTTG GGAACAAAGC CTTTTTAACA CACGTAAGGG CCCTCAAACC 4200

GTGGGACCTC TAAGGAGACC TTTGAAGCTT TTTGAGGGCA AACTTTACCT TTGTGGTCCC 4260

CAAATGATGG CATTTCTCTT TGAAATTTAT TAGATACTGT TATGTCCCCC AAGGGTACAG 4320 
GAGGGGCATC CCTCAGCCTA TGGGAACACC CAAACTAGGA GGGGTTATTG ACAGGAAGGA 4380 
ATGAATCCAA GTGAAGGCTT TCTGCTCTTC GTGTTACAAA CCAGTTTCAG AGTTAGCTTT 4440 
CTGGGGAGGT GTGTGTTTGT GAAAGGAATT CAAGTGTTGC AGGACAGATG AGCTCAAGGT 4500 

AAGGTAGCTT TGGCAGCAGG GCTGATACTA TGAGGCTGAA ACAATCCTTG TGATGAAGTA 4560

GATCATGCAG TGACATACAA AGACCAAGGA TTATGTATAT TTTTATATCT CTGTGGTTTT 4620 
GAAACTTTAG TACTTAGAAT TTTGGCCTTC TGCACTACTC TTTTGCTCTT ACGAACATAA 4680 
TGGACTCTTA AGAATGGAAA GGGATGACAT TTACCTATGT GTGCTGCCTC ATTCCTGGTG 4740 
AAGCAACTGC TACTTGTTCT CTATGCCTCT AAAATGATGC TGTTTTCTCT GCTAAAGGTA 4800 

AAAGAAAAGA AAAAAATAGT TGGAAAATAA GACATGCAAC TTGATGTGCT TTTGAGTAAA 4860

TTTATGCAGC AGAAACTATA CAATGAAGGA AGAATTCTAT GGAAATTACA AATCCAAAAC 4920 
TCTATGATGA TGTCTTCCTA GGGAGTAGAG AAAGGCAGTG AAATGGCAGT TAGACCAACA 4980 
GAGGCTTGAA GGATTCAAGT ACAAGTAATA TTTTGTATAA AACATAGCAG TTTAGGTCCC 5040 

CATAATCCTC AAAAATAGTC ACAAATATAA CAAAGTTCAT TGTTTTAGGG TTTTTAAAAA 5100

ACGTGTTGTA CCTAAGGCCA TACTTACTCT TCTATGCTAT CACTGCAAAG GGGTGATATG 5160
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TATGTATTAT ATAAAAAAAA AAACCCTTAA TGCACTGTTA TCTCCTAAAT ATTTAGTAAA 5220 

TTAATACTAT TTAATTTTTT TAAAGATTTG TCTGTGTAGA CACTAAAAGT ATTACACAAA 5280 

ATCTGOACTG AAGGTGTCCT TTTTAACAAC AATTTAAAGT ACTTTTTATA TATGTTATGT 5340 

AGTATATCCT TTCTAAACTG CCTAGTTTGT ATATTCCTAT AATTCCTATT TGTGAAGTGT 5400 

ACCTGTTCTT GTCTCTTTTT TCAGTCATTT TCTGCACGCA TCCCCCTTTA TATGGTTATA 5460 

GAGATGACTG TAGCTTTTCG TGCTCCACTG CGAGGTTTGT GCTCAGAGCC GCTGCACCCC 5520 

AGCGAGGCCT GCTOCATCGA GTGCAGGACG AGCTACTGCT TTGGAGCGAG G 6TT TOCTQC 5580 

TTTTGAGTTG ACCTGACTTC CTTCTTGAAA TGACTGTTAA AACTAAAATA AATTACATTG 5640 
CATTTATTTT ATATTCTTGG TTGAAATAAA ATTTAATTGA CTTTG 

SEQ ID NO:78 PD03 Protein sequence:
Protein Accession #: BAA82980

1 11 21 31 41 51

| | | | | |

VKSLLYQILD GIHYLHANWV LHRDLKPANI LVMGEGPERG RVKIADMGFA RLFNSPLKPL 60
ADLDPVVVTF WYRAPELLLG ARHYTKAIDI WAIGCIFAEL LTSEPIFHCR QEDIKTSNPF 120
HHDQLDRIFS VMGFPADKDW EDIRKMPEYP TLQKDFRRTT YANSSLIKYM EKHKVKPDSK 180
VFLLLQKLLT MDPTKRITSE QALQDPYFQE DPLPTLDVFA GCQIPYPKRE FLNEDDPEEK 240
GDKNQQQQQN QHQQPTAPPQ QAAAPPQAPP PQQNSTQTNG TAGGAGAGVG GTGAGLQHSQ 300

DSSLNQVPPN KKPRLGPSGA NSGGPVMPSD YQHSSSRLNY QSSVQGSSQS QSTLGYSSSS 360
QQSSQYHPSH QAHRY 

SEQ ID NO:79 PD05 DNA SEQUENCE

Nucleic Acid Accession #: XM_002922

Coding sequence: 1-2190 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

ATGAATCCTT TCCAGAAAAA TGAGTCCAAG GAAACTCTTT TTTCACCTGT CTCCATTGAA 60

GAGGTACCAC CTCGACCACC TAGCCCTCCA AAGAAGCCAT CTCCGACAAT CTGTGGCTCC 120 

AACTATCCAC TGAGCATTGC CTTCATTGTG GTGAATGAAT TCTGCGAGCG CTTTTCCTAT 180 

TATGGAATGA AAGCTGTGCT GATCCTGTAT TTCCTGTATT TCCTGCACTG GAATGAAGAT 240 

ACCTCCACAT CTATATACCA TGCCTTCAGC AGCCTCTGTT ATTTTACTCC CATCCTGGGA 300 

GCAGCCATTG CTGACTCGTG GTTGGGAAAA TTCAAGACAA TCATCTATCT CTCCTTGGTG 360

TATGTGCTTG GCCATGTGAT CAAGTCCTTG GGTGCCTTAC CAATACTGGG AGGACAAGTG 420 

GTACACACAG TCCTATCATT GATCGGCCTG AGTCTAATAG CTTTGGGGAC AGGAGGCATC 480 

AAACCCTGTG TGGCAGCTTT TGGTGGAGAC CAGTTTGAAG AAAAACATGC AGAGGAACGG 540 

ACTAGATACT TCTCAGTCTT CTACCTGTCC ATCAATGCAG GGAGCTTGAT TTCTACATTT 600 

ATCACACCCA TGCTGAGAGG AGATGTGCAA TGTTTTGGAG AAGACTGCTA TGCATTGGCT 660

TTTGGAGTTC CAGGACTGCT CATGGTAATT GCACTTGTTG TGTTTGCAAT GGGAAGCAAA 720 

ATATACAATA AACCACCCCC TGAAGGAAAC ATAGTGGCTC AAGTTTTCAA ATGTATCTGG 780 

TTTGCTATTT CCAATCGTTT CAAGAACCGT TCTGGAGACA TTCCAAAGCG ACAGCACTGG 840 

CTAGACTGGG CAGCTGAGAA ATATCCAAAG CAGCTCATTA TGGATGTAAA GGCACTGACC 900 

AGGGTACTAT TCCTTTATAT CCCATTGCCC ATGTTCTGGG CTCTTTTGGA TCAGCAGGGT 960

TCACGATGGA CTTTGCAAGC CATCAGGATG AATAGGAATT TGGGGTTTTT TGTGCTTCAG 1020 

CCGGACCAGA TGCAGGTTCT AAATCCCTTT CTGGTTCTTA TCTTCATCCC GTTGTTTGAC 1080 

TTTGTCATTT ATCGTCTGGT CTCCAAGTGT GGAATTAACT TCTCATCACT TAGGAAAATG 1140 

GCTGTTGGTA TGATCCTAGC GTGCCTGGCA TTTGCAGTTG CGGCAGCTGT AGAGATAAAA 1200 

ATAAATGAAA TGGCCCCAGC CCAGTCAGGT CCCCAGGAGG TTTTCCTACA AGTCTTGAAT 1260

CTGGCAGATG ATGAGGTGAA GGTGACAGTG GTGGGAAATG AAAACAATTC TCTGTTGATA 1320 

GAGTCCATCA AATCCTTTCA GAAAACACCA CACTATTCCA AACTGCACCT GAAAACAAAA 1380 

AGCCAGGATT TTCACTTCCA CCTGAAATAT CACAATTTGT CTCTCTACAC TGAGCATTCT 1440 

GTGCAGGAGA AGAACTGGTA CAGTCTTGTC ATTCGTGAAG A«PGGGAACAG TATCTCCAGC 1500 

ATGATGGTAA AGGATACAGA AAGCAAAACA ACCAATGGGA TGACAACCGT GAGGTTTGTT 1560

AACACTTTGC ATAAAGATGT CAACATCTCC CTGAGTACAG ATACCTCTCT CAATGTTGGT 1620 

GAAGACTATG GTGTGTCTGC TTATAGAACT GTGCAAAGAG GAGAATACCC TGCAGTGCAC 1680 

TGTAGAACAG AAGAXAAGAA CTTTTCTCTG AATTTGGGTC TTCTAGACTT TGGTGCAGCA 1740 

TATCTGTTTG TTATTACTAA TAACACCAAT CAGGGTCTTC AGGCCTGGAA GATTGAAGAC 1800 

ATTCCAGCCA ACAAAATGTC CATTGCGTGG CAGCTACCAC AATATGCCCT GGTTACAGCT 1860

GGGGAGGTCA TGTTCTCTGT CACAGGTCTT GAGTTTTCTT ATTCTCAGGC TCCCTCTAGC 1920 

ATGAAATCTG TGCTCCAGGC AGCTTGGCTA TTGACAATTG CAGTTGGGAA TATCATCGTG 1980 

CTTGTTGTGG CACAGTTCAG TGGCCTGGTA CAGTGGGCCG AATTCATTTT GTTTTCCTGC 2040 

CTCCTGCTGG TGATCTGCCT GATCTTCTCC ATCATGGGCT ACTACTATGT TCCTGTAAAG 2100 

ACAGAGGATA TGCGGGGTCC AGCAGATAAG CACATTCCTC ACATCCAGGG GAACATGATC 2160
AAACTAGAGA CCAAGAAGAC AAAACTCTGA 



SEQ ID NO:80 PD05 Protein sequence:

Protein Accession #: XP_002922

1 11 21 31 41 51

| | | | | |

MNPFQKNESK ETLFSPVSIE EVPPRPPSPP KKPSPTICGS NYPLSIAFIV VNEFCERFSY 60

YGMKAVLILY FLYFLHWNED TSTSIYHAFS SLCYFTPILG AAIADSWLGK FKTIIYLSLV 120

YVLGHVIKSL GALPILGGQV VHTVLSLIGL SLIALGTGGI KPCVAAFGGD QFEEKHAEER 180

TRYFSVFYLS INAGSLISTF ITPHLRGDVQ CFGEDCYALA FGVPGLLMVI ALWFAMGSK 240 

IYNKPPPEGN IVAQVPKCIW FAISNRFKKR SGDIPKRQHW LDWAAEKYFK QLZMDVKALT 300 

RVLFLYIPLP MFWALLDQQG SRWTLQAIHM NRNLGFPVLQ PDQMQVLNPF LVLIFIPLFD 360 

PVIYRLVSKC GINFSSLRKM AVGMILACLA PAVAAAVEIK INEMAPAQSG PQEVFLQVI2J 420 

LADDEVKVTV VGNENNSLLI ESIKSFQKTP HYSKLHLKTK SQDFHFHLKY HNLSLYTEHS 480
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VQEKNWYSLV IREDGNSISS MMVKDTESKT 1NGMTTVRFV NTLHKDVNIS LSTDTSLNVG 540 

EDYGVSAYRT VQRGEYPAVB CRTEDKNFSL NLGLLDFGAA YLFVITONTO QGLQAWKIED 600 

IPANKMSIAW QLPQYALVTA GBVMFSVTGL EPSYSQAPSS HKSVLQAAWL LTIAVGNIIV 660 

LVVAQFSGLV QWAEFILFSC LLLVICLIFS IKGYYYVFVK TEDMRGPADK RIPHIQGNMI 720 
KLETKKTKL 

SEQ ID NO:81 PD06 DNA SEQUENCE

Nucleic Acid Accession #: NM_020448

Coding sequence: 1-1221 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

ATGGACGGAT CCCACAGCGC AGCCCTGAAG CTGCAGCAGC TGCCTCCCAC AAGTAGCTCC 60

AGCGCCGTAA GCGAGGCCTC CTTCTCCTAC AAGGAAAACC TGATTGGCGC CCTCTTGGCG 120 

ATCTTCG6GC ACCTCGTOGT CA6CATTGCA CTTAACCTCC AGAAGTACTG CCACATCCGC 180 

CTGGCAGGCT CCAAGGATCC CCGGGCCTAT TTCAAGACCA AGACATGGTG GCTGGGCCTG 240 

TTGCTGATGC TTCTGGGCGA GCTGGGTGTG TTCGCCTCCT ACGCCTTCGC GCCGCTGTCA 300

CTCATCGTGC CCCTCAGCGC AGTTTCTGTG ATAGCTAGTG CCATCATAGG AATCATATTC 360

ATCAAGGAAA AGTGGAAACC GAAAGACTTT CTGAGGCGCT ACGTCTTGTC CTTTGTTGGC 420

TGCGGTTTGG CTGTCGTGGG TACCTACCTG CTGGTGACAT TCGCACCCAA CAGTCACGAG 480 

AAGATGACAG GCGAGAATGT CACCAGGCAC CTCGTGAGCT GGCCTTTCCT TTTGTACATG 540 

CTGGTGGAGA TCATTCTGTT CTGCTTGCTG CTCTACTTCT ACAAGGAGAA GAACGCCAAC 600 

AACATTGTCG TGATTCTTCT CTTGGTGGCG TTACTTGGCT CCATGACAGT GGTGACAGTC 660 

AAGGCCGTGG CTGGGATGCT TGTCTTGTCC ATTCAAGGGA ACCTCCAGCT TGACTACCCC 720

ATCTTCTACG TGATGTTCGT GTGCATGGTG GCAACCGCCG TCTATCAGGC TGCGTTTTTG 780 

AGTCAAGCCT CACAGATGTA CGACTCCTCT TTGATTGCCA GTGTGGGCTA CATTCTGTCC 840

ACAACCATTG CTATCACAGC AGGTGCAATA TTTTACCTGG ACTTCATCGG GGAGGACGTG 900 

CTGCACATCT GCATGTTTGC ACTGGGGTGC CTCATTGCAT TCTTGGGCGT CTTCTTAATC 960 

ACGCGTAACA GGAAGAAGCC CATTCCATTT GAGCCCTATA TTTCCATGGA TGCCATGCCA 1020

GGTATGCAGA ACATGCACGA TAAAGGGATG ACTGTCCAGC CTGAACTTAA AGCTTCTTTT 1080 

TCCTATGGGG CTCTGGAAAA CAATGACAAC ATTTCTGAGA TCTACGCTCC TGCCACCCTG 1140 

CCAGTCATGC AAGAAGAGCA CGGCTCCAGA AGTGCCTCTG GGGTCCCCTA CCGAGTCCTA 1200 
GAGCACACCA AGAAGGAATG A



SEQ ID NO:82 PD06 Protein sequence:

Protein Accession #: NP_065181

1 11 21 31 41 51

| | | | | |

MDGSHSAALK LQQLPPTSSS SAVSEASFSY KENLIGALLA IFGHLVVSIA LNLQKYCHIR 60

LAGSKDPRAY FKTKTWWLGL FLMLLGELGV FASYAFAPLS LIVPLSAVSV IASAIIGIIF 120

IKEKWKPKDF LRRYVLSFVG CGLAVVGTYL LVTFAPNSHE KMTGENVTRH LVSWPFLLYM 180

LVEIILFCLL LYFYKEKNAN NIVVILLLVA LLGSMTVVTV KAVAGMLVLS IQGNLQLDYP 240

IFYVMFVCMV ATAVYQAAFL SQASQMYDSS LIASVGYILS TTIAITAGAI FYLDFIGEDV 300

LHICMFALGC LIAFLGVFLI TRNRKKPIPF EPYISMDAMP GMQNMHDKGM TVQPELKASF 360
SYGALENNDN ISEIYAPATL PVMQEEHGSR SASGVPYRVL EHTKKE

SEQ ID NO:83 PD08 DNA SEQUENCE

Nucleic Acid Accession #: NM_032712
Coding sequence: 555-908 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

CACTCATTAA GAACAGAGGA GGCTGCCTGT TACTCCTGGT GTTGCATCCC TCCAGACACT 60 

CTGCTGTTTC CTGCCTAGGC GTGGCTGCAG CCATGGCTAG GAAAGCGCTG CCACCCACCC 120

ACCTGGGCCA GAGCTGGTTC TGCTCCTGCT GCAGGGACAC TGAGCTGGCT ATCTCGGCGC 180 

TTCGGGCAAG AACTGCAACA GGCTCTCCTG GGTCCTGCAG GTGTACAGCC GGGCCCCTGC 240 

CTlVJXfl X TC AGCTCTCGAG AGCTGCTGC T GCCGGGTGAC CTGATCCAAC CTGATAAGGT 300 

GCCATCTTCA GCTACCACTG CAAGGCOCTG AGGGCAACAG CAGCACGGCA CTGCOCACCC 360

GGCTGCTGAT GGCCTGGTGC CAGCTGGGAG TCCTCCCGGC ACTTCGAGGC CACTGAGCCA 420

CCCTTCCAGC CCCAGCCCAC CATGGACAGG GGTATCCAGC TTCCTCCTCA ACCTCGTCCT 480 

CTGCCCCTGA GCCAGTGACG CCCAAGGACA TGCCTGTTAC CCAGGTCCTG TACCAGCACT 540 

AGCTGGTCAA GGGCATGACA GTGCTGGAGG CCGTCTTGGA GATCCAGGCC ATCACTGGCA 600 

GCAGGCTGCT CTCCATGGTG CCAGGGCCCG CCAGGCCACC AGGCTCATGC TGGGACCCAA 660 

CCCAGTGCAC AAGGACTTGG CTGCTGAGCC ACACACCCAG GAGAAGGTGG ATAAGTGGGC 720 

TACCAAGGGC TTCCTGCAGG CTAGGGGAGG AGCCACCCCC GCTTCCCTAT TGTGACCAGG 780 

CCTATGGGGA GGAGCTGTCC ATACGCCACC GTGAGACCTG GGCCTGGCTC TCAAGGACAG 840 

ACACCGCCTG GCCTGGTGCT CCAGGGGTGA AGCAGGCCAG AATCCTGGGG GAGCTGCTCC 900

TGGTTTGAGC TGCATTCAGG AAGTGCGGGA CATGGTAGGG GAGGCAAAAA GCCTTGGGCA 960 

CTACCCTCCC TGTGGAGCTG TTCGGTGTCC GTCGAGCTAG CCACACCCTG ACACCATGTT 1020

CAAGGGTACC GGAAGAGAAG GGTGTCTGCC CCCAACCTCC CCTGTGGGTG TCACTGGCCA 1080 

GATGTCATGA GGGAAGCAGG GCTTGTGAGT GGACACTGAC CATGAGTCCC TGGGGGGAGT 1140 

GATCCCCCAG GCATCGTGTG CCATCTTGCA CTTCTGCCCA GGCAGCAGGG TGGGTGGGTA 1200 

CCATGGGTGC CCACCCCTCC ACCACATGGG GCCCCAAAGC ACTGCAGGCC AAGCAGGGCA 1260 
ACCCCACACC CTTGACATAA AAGCATCTTG AAGCTTTTAA AAAAAAAAAA AAAAAA 

SEQ ID NO:84 PD08 Protein sequence:
Protein Accession #: NP_116101

1 11 21 31 41 51
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I II I I I 

MTVLEAVLEI QAITGSRLLS MVPGPARPPG SCWDPTQCTR TWLLSHTPRR RWISGLPRAS 60
CRLGEEPPPL PYCDQAYGEE LSIRHRETWA WLSRTDTAWP GAPGVKQARI LGELLLV 

SEQ ID NO:85 PDT1 DNA SEQUENCE

Nucleic Acid Accession #: NM_000693

Coding sequence: 53-1591 (underlined sequences correspond to start and stop codons)
1 11 21 31 41 51

| | | | | |

AGCCGGTGCG CCGCAGACTA GGGCGCCTCG GGCCAGGGAG CGCGGAGGAG CCATGGCCAC 60 

CGCTAACGGG GCCGTGGAAA ACGGGCAGCC GGACGGGAAG CCGCCGGCCC TGCCGCGCCC 120 

CATCCGCAAC CTGGAGGTCA AGTTCACCAA GATATTTATC AACAATGAAT GGCACGAATC 180

CAAGAGTGGG AAAAAGTTTG CTACATGTAA CCCTTCAACT CGGGAGCAAA TATGTGAAGT 240

GGAAGAAGGA GATAAGCCCG ACGTGGACAA GGCTGTGGAG GCTGCACAGG TTGCCTTCCA 300 

GAGGGGCTCG CCATGGCGCC GGCTGGATGC CCTGAGTCGT GGGCGGCTGC TGCACCAGCT 360 

GGCTGACCTG GTGGAGAGGG ACCGCGCCAC CTTGGCCGCC CTGGAGACGA TGGATACAGG 420 

GAAGCCATTT CTTCATGCTT TTTTCATCGA CCTGGAGGGC TGTATTAGAA CCCTCAGATA 480 

CTTTGCAGGG TGGGCAGACA AAATCCAGGG CAAGACCATC CCCACAGATG ACAACGTCGT 540 

ATGCTTCACC AGGCATGAGC CCATTGGTGT CTGTGGGGCC ATCACTCCAT GGAACTTCCC 600

CCTGCTGATG CTGGTGTGGA AGCTGGCACC CGCCCTCTGC TGTGGGAACA CCATGGTCCT 660 

GAAGCCTGCG GAGCAGACAC CTCTCACCGC CCTTTATCTC GGCTCTCTGA TCAAAGAGGC 720 

CGGGTTCCCT CCAGGAGTGG TGAACATTGT GCCAGGATTC GGGCCCACAG TGGGAGCAGC 780

AATTTCTTCT CACCCTCAGA TCAACAAGAT CGCCTTCACC GGCTCCACAG AGGTTGGAAA 840 

ACTGGTTAAA GAAGCTGCGT CCCGGAGCAA TCTGAAGCGG GTGACGCTGG AGCTGGGGGG 900 

GAAGAACCCC TGCATCGTGT GTGCGGACGC TGACTTGGAC TTGGCAGTGG AGTGTGCCCA 960

TCAGGGAGTG TTCTTCAACC AAGGCCAGTG TTGCACGGCA GCCTCCAGGG TGTTCGTGGA 1020 

GGAGCAGGTC TACTCTGAGT TTGTCAGGCG GAGCGTGGAG TATGCCAAGA AACGGCCCGT 1080 

GGGAGACCCC TTCGATGTCA AAACAGAACA GGGGCCTCAG ATTGATCAAA AGCAGTTCGA 1140 

CAAAATCTTA GAGCTGATCG AGAGTGGGAA GAAGGAAGGG GCCAAGCTGG AATGCGGGGG 1200 

CTCAGCCATG GAAGACAAGG GGCTCTTCAT CAAACCCACT GTCTTCTCAG AAGTCACAGA 1260

CAACATGCGG ATTGCCAAAG AGGAGATTTT CGGGCCAGTG CAACCAATAC TGAAGTTCAA 1320 

AAGTATCGAA GAAGTGATAA AAAGAGCGAA TAGCACCGAC TATGGACTCA CAGCAGCCGT 1380 

GTTCACAAAA AATCTCGACA AAGCCCTGAA GTTGGCTTCT GCCTTAGAGT CTGGAACGGT 1440 

CTGGATCAAC TGCTACAACG CCCTCTATGC ACAGGCTCCA TTTGGTGGCT TTAAAATGTC 1500 

AGGAAATGGC AGAGAACTAG GTGAATACGC TTTGGCCGAA TACACAGAAG TGAAAACTGT 1560 

CACCATCAAA CTTGGCGACA AGAACCCCTG AAGGAAAGGC GGGGCTCCTT CCTCAAACAT 1620

CGGACGGCGG AATGTGGCAG ATGAAATGTG CTGGAGGAAA AAAATGACAT TTCTGACCTT 1680 

CCCGGGACAC ATTCTTCTGG AGGCTTTACA TCTACTGGAG TTGAATGATT GCTGTTTTCC 1740 

TCTCACTCTC CTGTTTATTC ACCAGACTGG GGATGCCTAT AGGTTGTCTG TGAAATCGCA 1800 

GTCCTGCCTG GGGAGGGAGC TGTTGGCCAT TTCTGTGTTT CCCTTTAAAC CAGATCCTGG 1860 

AGACAGTGAG ATACTCAGGG CGTTGTTAAC AGGGASTGGT ATTTGAAGTG TCCAGCAGTT 1920 

GCTTGAAATG CTTTGCCGAA TCTGACTCCA GTAAGAATGT GGGAAAACCC CCTGTGTGTT 1980 

CTGCAAGCAG GGCTCTTGCA CCAGCGGTCT CCTCAGGGTG GACCTGCTTA CAGAGCAAGC 2040 

CACGCCTCTT TCCGAGGTGA AGGTGGGACC ATTCCTTGGG AAAGGATTCA CAGTAAGGTT 2100 

TTTTGGTTTT TGTTTTTTGT TrrC TT G' m ' TTAAAAAAAG GATTTCACAG TGAGAAAGTT 2160 

TTGGTTAGTG CATACCGTGG AAGGGCGCCA GGGTCTTTGT GGATTGCATG TTGACATTGA 2220 

CCGTGAGATT CGGCTTCAAA CCAATACTGC CTTTGGAATA TSACAGAATC AATAGCCCAG 2280 

AGAGCTTAGT CAAAGACGAT ATCACGGTCT ACCTTAACCA AGGCACTTTC TTAAGCAGAA 2340 

AATATTGTTG AGGTTACCTT TGCTGCTAAA GATCCAATCT TCTAACGCCA CAACAGCATA 2400 

GCAAATCCTA GGATAATTCA CCTCCTCATT TGACAAATCA GAGCTGTAAT TCACTTTAAC 2460 

AAATTACGCA TTTCTATCAC GTTCACTAAC AGCTTATGAT AAGTCTGTGT AGTCTTCCTT 2520 

TTCTCCAGTT CTGTTACCCA ATTTAGATTA GTAAAGCGTA CACAACTGGA AAGACTGCTG 2580 

TAATAACACA GCCTTGTTAT TTTTAAGTCC TATTTTGATA TTAATTTCTG ATTAGTTAGT 2640 

AAATAACACC TGGATTCTAT GGAGGACCTC GGTCTTCATC CAAGTGGCCT GAGTATTTCA 2700 

CTGGCAGGTT GTGAATTTTT CTTTTCCTCT TTGGGAATCC AAATGATGAT GTGCAATTTC 2760 

ATGTTTTAAC TTGGGAAACT GAAAGTGTTC CCATATAGCT TCAAAAACAA AAACAAATGT 2820 

GTTATCCGAC GGATACTTTT ATGGTTACTA ACTAGTACTT TCCTAATTGG GAAAGTAGTG 2880 

CTTAAGTTTG CAAATTAAGT TGGGGAGGGC AATAATAAAA TGAGGGCCCG TAACAGAACC 2940 

AGTGTGTGTA TAACGAAAAC CATGTATAAA ATGGGCCTAT CACCCTTGTC AGAGATATAA 3000 

ATTACCACAT TTGGCTTCCC TTCATCAGCT AACACTTATC ACTTATACTA CCAATAACTT 3060 

GTTAAATCAG GATTTGGCTT CATACACTGA ATTTTCAGTA TTTTATCTCA AGTAGATATA 3120 

GACACTAACC TTGATAGTGA TACGTTAGAG GGTTCCTATT CTTCCATTGT ACGATAATGT 3180 

CTTTAATATG AAATGCTACA TTATTTATAA TTGGTAGAGT TATTGTATCT TTTTATAGTT 3240 

GTAAGTACAC AGAGGTGGTA TATTTAAACT TCTGTAATAT ACTGTATTTA GAAATGGAAA 3300 

TATATATAGT GTTAGGTTTC ACTTCTTTTA AGGTTTACCC CTGTGGTGTG GTTTAAAAAT 3360 

CTATAGGCCT GGGAATTCCG ATCCTAGCTG CAGATCGCAT CCCACAATGC GAGAATGATA 3420 
AAATAAAATT GGATATTTGA GA 

SEQ ID NO:86 PDT1 Protein sequence:

Protein Accession #: NP_000684

1 11 21 31 41 51 

I I I I I I 

MATANGAVEN GQPDGKPPAL PRPIRNLEVK FTKIFINNEW HESKSGKKFA TCNPSTREQI 60

CEVEEGDKPD VDKAVEAAQV AFQRGSPWRR LDALSRGRLL HQLADLVERD RATLAALETM 120

DTGKPFLHAF FIDLEGCIRT LRYFAGWADK IQGKTIPTDD NVVCFTRHEP IGVCGAITPW 180

NFPLLMLVWK LAPALCCGNT MVLKPAEQTP LTALYLGSLI KEAGFPPGVV NIVPGFGPTV 240

GAAISSHPQI NKIAFTGSTE VGKLVKEAAS RSNLKRVTLE LGGKNPCIVC ADADLDLAVE 300
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CAHQGVFFNQ GQCCTAASRV FVEEQVYSEF VRRSVEYAKK RPVQDPFDVK TEQGPQIDQK 360 

QFDKILELIE SGKKEGAKLE CGGSAMEDKG LFIKPTVFSE VTDNMRIAKE EIFGPVQPIL 420

KFKSIEEVIK RANSTDYGLT AAVFTKNLDK ALKLASALES GTVWINCYNA LYAQAPFGGF 480
KMSGNGRELG EYALAEYTEV KTVTIKLGDK NP

SEQ ID NO:87 PDV3 DNA SEQUENCE

Nucleic Acid Accession #: NM_032642

Coding sequence: 184-1263 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

GACCATTAGC AGGCACCCAG GCCTGTCTTT GGCTCGGAAA CGGTGGCCCC CAATGTAGCC 60 

TAGTTTGAAC CTAGGAACTG CAGGACCAGA GAGATTCCAC TGGAGCCTGA TGGACGGGTG 120 

ACAGAGGGAA CCCTACTCTG GAAACTGTCA GTCCCAGGGC ACTGGGGAGG GCTGAGGCCG 180 

ACCATGCCCA GCCTGCTGCT GCTGTTCACG GCTGCTCTGC TGTCCAGCTG GGCTCAGCTT 240 

CTGACAGACG CCAACTCCTG GTGGTCATTA GCTTTGAACC CGGTGCAGAG ACCCGAGATG 300 

TTTATCATCG GTGCCCAGCC CGTGTGCAGT CAGCTTCCGG GGCTCTCCCC TGGCCAGAGG 360

AAGCTGTGCC AATTGTACCA GGAGCACATG GCCTACATAG GGGAGGGAGC CAAGACTGGC 420 

ATCAAGGAAT GCCAGCACCA GTTCCGGCAG CGGCGGTGGA ATTGCAGCAC AGCGGACAAC 480 

GCATCTGTCT TTGGGAGAGT CATGCAGATA GGCAGCCGAG AGACCGCCTT CACCCACGCG 540 

GTGAGCGCCG CGGGCGTGGT CAACGCCATC AGCGGGGCCT GCCGCGAGGG CGAGCTCTCC 600

ACCTGCGGCT GCAGCCGGAC GGCGCGGCCC AAGGACCTGC CCCGGGACTG GCTGTGGGGC 660 

GGCTGTGGGG ACAACGTGGA GTACGGCTAC CGCTTCGCCA AGGAGTTTGT GGATGCCCGG 720 

GAGCGAGAGA AGAACTTTGC CAAAGGATCA GAGGAGCAGG GCCGGGTGCT CATGAACCTG 780 

CAAAACAACG AGGCCGGTCG CAGGGCTGTG TATAAGATGG CAGACGTAGC CTGCAAATGC 840 

CACGGCGTCT CGGGGTCCTG CAGCCTCAAG ACCTGCTGGC TGCAGCTGGC CGAGTTCCGC 900 

AAGGTCGGGG ACCGGCTGAA GGAGAAGTAC GACAGCGCGG CCGCCATGCG CGTCACCCGC 960

AAGGGCCGGC TGGAGCTGGT CAACAGCCGC TTCACCCAGC CCACCCCGGA GGACCTGGTC 1020

TATGTGGACC CCAGCCCCGA CTACTGCCTG CGCAACGAGA GCACGGGCTC CCTGGGCACG 1080 

CAGGGCCGCC TCTGCAACAA GACCTCGGAG GGCATCGATG GCTGTGAGCT CATGTGCTGC 1140 

GGGCGTGGCT ACAACCAGTT CAAGAGCGTG CAGGTGGAGC GCTGCCACTG CAAGTTCCAC 1200 

TGGTGCTGCT TCGTCAGGTG TAAGAAGTGC ACGGAGATCG TGGACCAGTA CATCTGTAAA 1260 

TAGCCCGGAG GGCCTGCTCC CGGCCCCCCC TGCACTCTGC CTCACAAAGG TCTATATTAT 1320 

ATAAATCTAT ATAAATCTAT TTTATATTTG TATAAGTAAA TGGGTGGGTG CTATACAATG 1380 

GAAAGATGAA AATGGAAAGG AAGAGCTTAT TTAAGAGACG CTGGAGATCT CTGAGGAGTG 1440 

GACTTTGCTG GTTCTCTCCT CTTGGTGGGT GGGAGACAGG GCTTTTTCTC TCCCTCTGGC 1500 

GAGGACTCTC AGGATGTAGG GACTTGGAAA TATTTACTGT CTGTCCACCA CGGCCTGGAG 1560 

GAGGGAGGTT GTGGTTGGAT GGAGGAGATG ATCTTGTCTG GAAGTCTAGA GTCTTTGTTG 1620 

GTTAGAGGAC TGCCTGTGAT CCTGGCCACT AGGCCAAGAG GCCCTATGAA GGTGGCGGGA 1680 

ACTCAGCTTC AACCTCGATG TCTTCAGGGT CTTGTCCAGA ATGTAGATGG GTTCCGTAAG 1740 

AGGCCTGGTG CTCTCTTACT CTTTCATCCA CGTGCACTTG TGCGGCATCT GCAGTTTACA 1800 

GGAACGGCTC CTTCCCTAAA ATGAGAAGTC CAAGGTCATC TCTGGCCCAG TGACCACAGA 1860 

GAGATCTGCA CCTCCCGGAC TTCAGGCCTG CCTTTCCAGC GAGAATTCTT CATCCTCCAC 1920 

GGTTCACTAG CTCCTACCTG AAGAGGAAAG GGGGCCATTT GACCTGACAT GTCAGGAAAG 1980 

CCCTAAACTG AATGTTTGCG CCTGGGCTGC AGAAGCCAGG GTGCATGACC AGGCTGCGTG 2040 

GACGTTATAC TGTCTTCCCC CACCCCOGGG GAGGGGAAGC TTGAGCTGCT GCTGTCACTC 2100 

CTCCACCGAG GGAGGCCTCA CAAACCACAG GACGCTGCAA CGGGTCAGGC TGGGGGGCCC 2160 

GGCGTGCTCA TCATCTCTGC CCCAGGTGTA CGQTTTCTCT CTGACATTAA ATGCCCTTCA 2220 
TGGAAAAAAA AAAAAGAAAA AAAAAAAAAA AA 

SEQ ID NO:88 PDV3 Protein sequence:
Protein Accession #: NP_116031

1 11 21 31 41 51

| | | | | |

MPSLLLLFTA ALLSSWAQLL TDANSWWSLA LNPVQRPEMF IIGAQPVCSQ LPGLSPGQRK 60

LCQLYQEHMA YIGEGAKTGI KECQHQFRQR RWNCSTADNA SVFGRVMQIG SRETAFTHAV 120

SAAGVVNAIS RACREGELST CGCSRTARPK DLPRDWLWGG CGDNVEYGYR FAKEFVDARE 180

REKNFAKGSE EQGRVLMNLQ NNEAGRRAVY KMADVACKCH GVSGSCSLKT CWLQLAEFRK 240

VGDRLKEKYD SAAAMRVTRK GRLELVNSRF TQPTPEDLVY VDPSPDYCLR NESTGSLGTQ 300
GRLCNKTSEG MDGCELMCCG RGYNQFKSVQ VERCHCKFHW CCFVRCKKCT EIVDQYICK*

SEQ ID NO:89 P DTD DNA SEQUENCE

Nucleic Acid Accession #: NM_033280

Coding sequence: 58-636 (underlined sequences correspond to start and stop codons)
1 11 21 31 41 51 

I I I I I I 

GGCAGCCGTC TGTGCCACCC AGAGCCGGCG GGCCGCTAGG TCCCCGGAGA CCCTGCTATG 60 

GTGCGTGCGG GCGCCGTGGG GGCTCATCTC CCCGCGTCCG GCTTGGATAT CTTCGGGGAC 120 

CTGAAGAAGA TGAACAAGCG CCAGCTCTAT TACCAGGTTT TAAACTTCGC CATGATCGTG 180 

TCTTCTGCAC TCATGATATG GAAAGGCTTG ATCGTGCTCA CAGGCAGTGA GAGCCCCATC 240 

GTGGTGGTGC TGAGTGGCAG TATGGAGCCG GCCTTTCACA GAGGAGACCT CCTGTTCCTC 300 

ACAAATTTCC GGGAAGACCC AATCAGAGCT GGTGAAATAG TTGTTTTTAA AGTTGAAGGA 360 

CGAGACATTC CAATAGTTCA CAGAGTAATC AAAGTTCATG AAAAAGATAA TGGAGACATC 420 

AAATTTCTGA CTAAAGGAGA TAATAATGAA GTTGATGATA GAGGCTTGTA CAAAGAAGGC 480 

CAGAACTGGC TGGAAAAGAA GGACGTGGTG GGAAGAGCAA GAGGGTTTTT ACCATATGTT 540 

GGTATGGTCA CCATAATAAT GAATGACTAT CCAAAATTCA AGTATGCTCT TTTGGCTGTA 600 

ATGGGTGCAT ATGTGTTACT AAAACGTGAA TCCTAAAATG AGAAGCAGTT CCTGGGACCA 660 

GATTGAAATG AATTCTGTTG AAAAAGAGAA AAACTAATAT ATTTGAGATG TTCCATTTTC 720 
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5 




TGTATAAAAG GGAACAGTGT GGAGATGTTT TTGTCTTGTC CAAATAAAAG ATTCACCAGT 780 
AAAAAAAAAA AAAA 

SEQ ID NO:90 Pmp Protein sequence:
Protein Accession #: NP_150596

1 11 21 31 41 51

| | | | | |

MVRAGAVGAH LPASGLDIFG DLKKMNKRQL YYQVLNFAMI VSSALMIWKG LIVLTGSESP 60
IVVVLSGSME PAFHRGDLLF LTNFREDPIR AGEIVVFKVE GRDIPIVHRV IKVHEKDNGD 120
IKFLTKGDNN EVDDRGLYKE GQNWLEKKDV VGRARGFLPY VGMVTIIMND YPKFKYALLA 180
VMGAYVLLKR ES 

SEQ ID NO:91 PDV5 DNA SEQUENCE

Nucleic Acid Accession #: NM_016590

Coding sequence: 691-975 (underlined sequences correspond to start and stop codons)



l 

I 

GATTACTCAC 
CGTGTCAGAA 
TAC6CCAACT 
0CAGAG0CA6 
AGGGCTGCAC 
CACTTTGCCT 
GGTGATCTGG 
CAGCTGTGCC 
CAAACGCCTG 
CAGGTTGTTA 
TGGAAAACAA 
AAAATTTCAT 
AGCCAACAAT 
TTTCCCATAG 
6TCACCA6CA 
TGTGTCAATA 
AAAAGTTTTG 
GAACACATTC 
ATAATAATCA 
ATCTTTAOTC 
GATGTCCCAT 
AG CCTGTGCA 
ATCTTCTTTT 
CTCAATTCAT 
CGATCAATTA 
AAG6GTCACC 
AAACAAGAAT 
TGAACCAAGC 
TCTCAAAATC 



CTGTTAGTAT 
GATAAAGAGC 
GAACCAGTTT 
CTCATAGTGT 
CTCATCTTAA 



11 
I 

ACAGTCTTGA 
CTCAATTACO 
ATAGGACTCO 
CCAATCACTT 
TGGAACAACA 
CTAAAGGCCA 
GGAAAACGCA 
GGCGCTACGG 
AGTGCTGCTG 
GAATTGTTAC 
6CCCTTTATT 
TAAAATCCCC 
ACTCTAAACT 
GAAGACTTCA 
ACCATCCGCA 
TACAACTGAG 
TTTGACTCAA 
ATGATGAGAA 
TGATAACCTG 
CTTGGAGCTG 
TATTATCCAC 
AAACAAAGCA 
GCCTAAAATT 
CAGGACTTGT 
ATGTTTTCTG 
CAAATAGCTG 
TAAGATGATC 
ACTGTCAGCA 
TGGGCCAAGA 
GCCAAAGAAA 
GTCAAATCAA 
CCCATTATTT 
CCTGGTAGGG 
AAACTGAAAG 
CAAAGCAAAA 



21 
I 

AGATGCAATG 
ACTACATATG 
TGCTTCTCGT 
AGCTCCTCAT 
CAGATGAGAT 
GAGAAAAATC 
GCTACACCTG 
GACCCGAGCC 
CCTTOGGTGA 
COCCTTTACT 
GAATTTTCAA 
TTGAACTCCC 
GAGGCCTGCA 
CCTCCTACAA 
GTCATTCAAG 
TTACAGACTG 
CTTCAAGCTG 
CTTTCTAAAA 
AAACATGTTA 
TCACATAGCA 
CCTGAGCCAC 
ATGGAAAAGG 
ACTAATGCAC 
ATTAGCAGGT 
GTGATCACAT 
AGTGCAGTCC 
CCAATAAAAG 
AATCTCAGGT 
ATGATTGCTA 
GTCACTCATG 
CTAAGACTGG 
TCACAGTGCC 
AACTGCTGAC 
AAAAATAGTT 
AAAAATGCTT 



31 
I 

TCAGCTATTT 
CATTAAGGCA 
ACGCTGGGCT 
AACAAGTCTA 
ATTCTACACA 
ACAGCTTCCT 
GAGCAAGGTC 
GTCCCAGAAA 
CTATATGAGA 
CAGAGATAAC 
CACAGACTCC 
ATG TTCAAAT 
AGTCATTTCA 
CTCCGAAGAA 
TGGAAGCTTT 
TCCCCTGGCT 
CTCATCTGTT 
GACCAGCACT 
CTGGGACTCG 
GGGGCAACCT 
CATAATATGC 
AAACTAAAAA 
CACGTCAGTC 
TCTGGCTAGA 
CAGGCCCTAT 
TTGCTCATAT 
AAAAATTGCT 
ATTAGAGCAA 
GGTCCATAAG 
AGTAAACTAT 
CAGGGTATTA 
AGCCTCTACC 
AGTTTCAATG 
GCTTTTTAAA 
TAATTCAAAT 



41 

I 

AGGACAGAAA 
GGAACTGGCA 
ATAATCTATG 
ACTGGCTCTG 
TTAATCTACT 
TGTCGGAGGG 
TCTTCCCGGC 
CCAAAGGGCA 
ATGGAAACTT 
ATAGATTATC 
CTGCTTCTCA 
CTCCATTTGT 
TTTGTATTTT 
AACCCTTACT 
CACAGCTTTT 
CCCTGACCCT 
AGTAAGTGAT 
GC TCTTCC CC 
ACATTTTTCT 
CACACTGAAA 
TGTTTACATT 
ATATACATAC 




51 

I 

CATCCAAGGC 
GGCCTCAGGG 
AAACTGAGCT 
6AAAGCTGAA 
TATCTGGAAT 
GAAAAGGACA 
TTGGCAATCT 
GGCACGGCAG 
CTAAGGAAGC 
CAGGCTGAGA 
TCTCCTTAAT 
TGACAGACAA 
TGTCCAGAAA 
GTCCAAGACC 
GTACATTCTC 
TACAAACACT 
GTTCACTCCA 
TCCTATAATC 
GGGGATTGAA 
CAAAGGAAGT 
TATTTTCTTC 
TAGTACCATT 
AGGCATCATT 
CCTGTCATCA 
CATGGTATAC 
TTAACCCCGC 
AACCTTTTTC 
TTGAAAAGTG 
TGGCCTTGCC 
CAGACCCATC 
AGGTGACATG 
CTAGACCTTG 
GAGCCAATGC 
GAAGGCCTGC 
GATACTAAAA 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 



51 



SEQ ID NO:92 PDV5 Protein sequence 
Protein Accession #: NP_057674 

1 11 21 31 41 51 

| | | | | | 

MQCQLFRTET SKAVSELNYD YICIKAGTGR PQGTPTIGLV LLVRWAIIYE TELQSQPIT 

SEQ ID NO:93 PEE6 DNA SEQUENCE 

Nucleic Acid Accession #: NM_002606 

Coding sequence: 61-1842 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

| | | | | | 

CGCGGCGGCT GGCGTCGGGA AAGTACAGTA AAAAGTCCGA GTGCAGCCGC CGGGCGCAGG 60 

ATGGGATCCG GCTCCTCCAG CTACCGGCCC AAGGCCATCT ACCTGGACAT CGATGGACGC 120 

ATTCAGAAGG TAATCTTCAG CAAGTACTGC AACTCCAGCG ACATCATGGA CCTGTTCTGC 180 

ATCGCCACCG GCCTGCCTCG GAACACGACC ATCTCCCTGC TGACCACCGA CGACGCCATG 240 

GTCTCCATCG ACCCCACCAT GCCCGCGAAT TCAGAACGCA CTCCGTACAA AGTGAGACCT 300 

GTGGCCATCA AGCAACTCTC CGCTGGTGTC GAGGACAAGA GAACCACAAG CCGTGGCCAG 360 

TCTGCTGAGA GACCACTGAG GGACAGACGG GTTGTGGGCC TGGAGCAGCC CCGGAGGGAA 420 

GGAGCATTTG AAAGTGGACA GGTAGAGCCC AGGCCCAGAG AGCCCCAGGG CTGCTACCAG 480 

GAAGGCCAGC GCATCCCTCC AGAGAGAGAA GAATTAATCC AGAGCGTGCT GGCGCAGGTT 540 

GCAGAGCAGT TCTCAAGAGC ATTCAAAATC AATGAACTGA AAGCTGAAGT TGCAAATCAC 600 

TTGGCTGTCC TAGAGAAACG CGTGGAATTG GAAGGACTAA AAGTGGTGGA GATTGAGAAA 660 

TGCAAGAGTG ACATTAAGAA GATGAGGGAG GAGCTGGCGG CCAGAAGCAG CAGGACCAAC 720 

TGCCCCTGTA AGTACAGTTT TTTGGATAAC CACAAGAAGT TGACTCCTCG ACGCGATGTT 780 

CCCACTTACC CCAAGTACCT GCTCTCTCCA GAGACCATCG AGGCCCTGCG GAAGCCGACC 840 

TTTGACGTCT GGCTTTGGGA GCCCAATGAG ATGCTGAGCT GCCTGGAGCA CATGTACCAC 900 

GACCTCGGGC TGGTCAGGGA CTTCAGCATC AACCCTGTCA CCCTCAGGAG GTGGCTGTTC 960 

TGTGTCCACG ACAACTACAG AAACAACCCC TTCCACAACT TCCGGCACTG CTTCTGCGTG 1020 

GCCCAGATGA TGTACAGCAT GGTCTGGCTC TGCAGTCTCC AGGAGAAGTT CTCACAAACG 1080 

GATATCCTGA TCCTAATGAC AGCGGCCATC TGCCACGATC TGGACCATCC CGGCTACAAC 1140 

AACACGTACC AGATCAATGC CCGCACAGAG CTGGCGGTCC GCTACAATGA CATCTCACCG 1200 

CTGGAGAACC ACCACTGCGC CGTGGCCTTC CAGATCCTCG CCGAGCCTGA GTGCAACATC 1260 

TTCTCCAACA TCCCACCTGA TGGGTTCAAG CAGATCCGAC AGGGAATGAT CACATTAATC 1320 

TTGGCCACTG ACATGGCAAG ACATGCAGAA ATTATGGATT CTTTCAAAGA GAAAATGGAG 1380 

AATTTTGACT ACAGCAACGA GGAGCACATG ACCCTGCTGA AGATGATTTT GATAAAATGC 1440 

TGTGATATCT CTAACGAGGT CCGTCCAATG GAAGTCGCAG AGCCTTGGGT GGACTGTTTA 1500 

TTAGAGGAAT ATTTTATGCA GAGCGACCGT GAGAAGTCAG AAGGCCTTCC TGTGGCACCG 1560 

TTCATGGACC GAGACAAAGT GACCAAGGCC ACAGCCCAGA TTGGGTTCAT CAAGTTTGTC 1620 

CTGATCCCAA TGTTTGAAAC AGTGACCAAG CTCTTCCCCA TGGTTGAGGA GATCATGCTG 1680 

CAGCCACTTT GGGAATCCCG AGATCGCTAC GAGGAGCTGA AGCGGATAGA TGACGCCATG 1740 

AAAGAGTTAC AGAAGAAGAC TGACAGCTTG ACGTCTGGGG CCACCGAGAA GTCCAGAGAG 1800 

AGAAGCAGAG ATGTGAAAAA CAGTGAAGGA GACTGTGCCT GAGGAAAGCG GGGGGCGTGG 1860 

CTGCAGTTCT GGACGGGCTG GCCGAGCTGC GCGGGATCCT TGTGCAGGGA AGAGCTGCCC 1920 

TGGGCACCTG GCACCACAAG ACCATGTTTT CTAAGAACCA TTTTGTTCAC TGATACAAAA 1980 
AAAAAAAAAA A 



SEQ ID NO:94 PEE6 Protein sequence 

Protein Accession #: NP_002597 

1 11 21 31 41 51 

| | | | | | 

MGSGSSSYRP KAIYLDIDGR IQKVIFSKYC NSSDIMDLFC IATGLPRNTT ISLLTTDDAM 60 

VSIDPTMPAN SERTPYKVRP VAIKQLSAGV EDKRTTSRGQ SAERPLRDRR VVGLEQPRRE 120 

GAFESGQVEP RPREPQGCYQ EGQRIPPERE ELIQSVLAQV AEQFSRAFKI NELKAEVANH 180 

LAVLEKRVEL EGLKVVEIEK CKSDIKKMRE ELAARSSRTN CPCKYSFLDN HKKLTPRRDV 240 

PTYPKYLLSP ETIEALRKPT FDVWLWEPNE MLSCLEHMYH DLGLVRDFSI NPVTLRRWLF 300 

CVHDNYRNNP FHNFRHCFCV AQMMYSMVWL CSLQEKFSQT DILILMTAAI CHDLDHPGYN 360 

NTYQINARTE LAVRYNDISP LENHHCAVAF QILAEPECNI FSNIPPDGFK QIRQGMITLI 420 

LATDMARHAE IMDSFKEKME NFDYSNEEHM TLLKMILIKC CDISNEVRPM EVAEPWVDCL 480 

LEEYFMQSDR EKSEGLPVAP FMDRDKVTKA TAQIGFIKFV LIPMFETVTK LFPMVEEIML 540 
QPLWESRDRY EELKRIDDAM KELQKKTDSL TSGATEKSRE RSRDVKNSEG DCA 

SEQ ID NO:95 PEG4 DNA SEQUENCE 

Nucleic Acid Accession #: none 

Coding sequence: 41-559 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

| | | | | | 

CAGTCACAGG CGAGAGCCYT GGGATGCACC GGCCAGAGGC ATGCTGCTGC TGCTCACGCT 60 

TGCCCTCCTG GGGGGCCCCA CCTGGGCAGG GAAGATGTAT GGCCCTGGAG GAGGCAAGTA 120 

TTTCAGCACC ACTGAAGACT ACGACCATGA AATCACAGGG CTGCGGGTGT CTGTAGGTCT 180 

TCTCCTGGTG AAAAGTGTCC AGGTGAAACT TGGAGACTCC TGGGACGTGA AACTGGGAGC 240 

CTTAGGTGGG AATACCCAGG AAGTCACCCT GCAGCCAGGC GAATACATCA CAAAAGTCTT 300 

TGTOGOCTTC CAAGCTTTCC TCCGGGGTAT GGTCATGTAC ACCAGCAAGG ACCGCTATTT 360 

CTATTTTGGG AAGCTTGATG GCCAGATCTC CTCTGCCTAC CCCAGCCAAG AGGGGCAGGT 420 

GCTGGTGGGC ATCTATGGCC AGTA3CAACT CCTTGGCATC AAGAGCATTG GCTTTGAATG 480 

GAATTATCCA CTAGAGGAGC CGACCACTGA GCCACCAGTT AATCTCACAT ACTCAGCAAA 540 

CTCACCCGTG GGTCGCTAGG GTGGGGTATG GGGCCATCCG AGCTGAGGCC ATCTGTGTGG 600 

TGGTGGCTGA TGGTACTGGA GTAACTGAGT CGGGACGCTG AATCTGAATC CACCAATAAA 660 
TAAAGCTTCT GCAGAATCAG TGAAAAAAAA A 



SEQ ID NO:96 PEG4 Protein sequence 
Protein Accession #: FGENESH predicted 

1 11 21 31 41 51 

| | | | | | 

MLLLLTLALL GGPTWAGKMY GPGGGKYFST TEDYDHEITG LRVSVGLLLV KSVQVKLGDS 60 
WDVKLGALGG NTQEVTLQPG EYITKVFVAF QAFLRGMVMY TSKDRYFYFG KLDGQISSAY 120 
PSQEGQVLVG IYGQYQLLGI KSIGFEWNYP LEEPTTEPPV NLTYSANSPV GR 



SEQ ID NO:97 PEL9 DNA SEQUENCE 

Nucleic Acid Accession #: NM_006953 

Coding sequence: 33-896 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

| | | | | | 

CCGTTCCGCG CTCTGGCGGC TCCTCCCGGG CGATGCCTCC GCTCTGGGCC CTGCTGGCCC 60 

TCGGCTGCCT GCGGTTCGGC TCGGCTGTGA ACCTGCAGCC CCAACTGGCC AGTGTGACTT 120 

TCGCCACCAA CAACCCCACA CTTACCACTG TGGCCTTGGA AAAGCCTCTC TGCATGTTTG 180 

ACAGCAAAGA GGCCCTCACT GGCACCCACG AGGTCTACCT GTATGTCCTG GTCGACTCAG 240 

CCATTTCCAG GAATGCCTCA GTGCAAGACA GCACCAACAC CCCACTGGGC TCAACGTTCC 300 

TACAAACAGA GGGTGGGAGG ACAGGTCCCT ACAAAGCTGT GGCCTTTGAC CTGATCCCCT 360 

GCAGTGACCT GCCCAGCCTG GATGCCATTG GGGATGTGTC CAAGGCCTCA CAGATCCTGA 420 

ATGCCTACCT GGTCAGGGTG GGTGCCAACG GGACCTGCCT GTGGGATCCC AACTTCCAGG 480 

GCCTCTGTAA CGCACCCCTG TCGGCAGCCA CGGAGTACAG GTTCAAGTAT GTCCTGGTCA 540 

ATATGTCCAC GGGCTTGGTA GAGGACCAGA CCCTGTGGTC GGACCCCATC CGCACCAACC 600 

AGCTCACCCC ATACTCGACG ATCGACACGT GGCCAGGCCG GCGGAGCGGA GGCATGATCG 660 

TCATCACTTC CATCCTGGGC TCCCTGCCCT TCTTTCTACT TGTGGGTTTT GCTGGCGCCA 720 

TTGCCCTCAG CCTCGTGGAC ATGGGGAGTT CTGATGGGGA AACGACTCAC GACTCCCAAA 780 

TCACTCAGGA GGCTGTTCCC AAGTCGCTGG GGGCCTCGGA GTCTTCCTAC ACGTCCGTGA 840 

ACCGGGGGCC GCCACTGGAC AGGGCTGAGG TGTATTCCAG CAAGCTCCAA GACTGAGCCC 900 

AGCACCACCC CTGGGCAGCA GCATCCTCCT CTCTGGCCTT GCCCCAGGCC CTGCAGCGGT 960 

GGTTGTCACA OCCTGACTTC AGGGAAGGTG AAACAGGGCT TGTCCCTCCA ACTGCAGGAA 1020 
AACCCTTAAT AAAATCTTCT GATGAGTTCT AAAAAAAAA 

SEQ ID NO:98 PEL9 Protein sequence 
Protein Accession #: NP_008884 

1 11 21 31 41 51 

| | | | | | 

MPPLWALLAL GCLRFGSAVN LQPQLASVTF ATNNPTLTTV ALEKPLCMFD SKEALTGTHE 60 

VYLYVLVDSA ISRNASVQDS TNTPLGSTFL QTEGGRTGPY KAVAFDLIPC SDLPSLDAIG 120 

DVSKASQILN AYLVRVGANG TCLWDPNFQG LCNAPLSAAT EYRFKYVLVN MSTGLVEDQT 180 

LWSDPIRTNQ LTPYSTIDTW PGRRSGGMIV ITSILGSLPF FLLVGFAGAI ALSLVDMGSS 240 
DGETTHDSQI TQEAVPKSLG ASESSYTSVN RGPPLDRAEV YSSKLQD 

SEQ ID NO:99 PEN1 DNA SEQUENCE 

Nucleic Acid Accession #: NM_012391 

Coding sequence: 416-1423 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

| | | | | | 

GTCTGACTTC CTCCCAGCAC ATTCCTGCAC TCTGCCGTGT CCACACTGCC CCACAGACCC 60 

AGTCCTCCAA GCCTGCTGCC AGCTCCCTGC AAGCCCCTCA GGTTGGGCCT TGCCACGGTG 120 

CCAGCAGGCA GCCCTGGGCT GGGGGTAGGG GACTCCCTAC AGGCACGCAG CCCTGAGACC 180 

TCAGAGGGCC ACCCCTTGAG GGTGGCCAGG CCCCCAGTGG CCAACCTGAG TGCTGCCTCT 240 

GCCACCAGCC CTGCTGGCCC CTGGTTCCGC TGGCCCCCCA GATGCCTGGC TGAGACACGC 300 

CAGTGGCCTC AGCTGCCCAC ACCTCTTCCC GGCCCCTGAA GTTGGCACTG CAGCAGACAG 360 

CTCCCTGGGC ACCAGGCAGC TAACAGACAC AGCCGCCAGC CCAAACAGCA GCGGCATGGG 420 

CAGCGCCAGC CCGGGTCTGA GCAGCGTATC CCCCAGCCAC CTCCTGCTGC CCCCCGACAC 480 

GGTGTCGCGG ACAGGCTTGG AGAAGGCGGC AGCGGGGGCA GTGGGTCTCG AGAGACGGGA 540 

CTGGAGTCCC AGTCCACCCG CCACGCCCGA GCAGGGCCTG TCCGCCTTCT ACCTCTCCTA 600 

CTTTGACATG CTGTACCCTG AGGACAGCAG CTGGGCAGCC AAGGCCCCTG GGGCCAGCAG 660 

TCGGGAGGAG CCACCTGAGG AGCCTGAGCA GTGCCCGGTC ATTGACAGCC AAGCCCCAGC 720 

GGGCAGCCTG GACTTGGTGC CCGGCGGGCT GACCTTGGAG GAGCACTCGC TGGAGCAGGT 780 

GCAGTCCATG GTGGTGGGCG AAGTGCTCAA GGACATCGAG ACGGCCTGCA AGCTGCTCAA 840 

CATCACCGCA GATCCCATGG ACTGGAGCCC CAGCAATGTG CAGAAGTGGC TCCTGTGGAC 900 

AGAGCACCAA TACCGGCTGC CCCCCATGGG CAAGGCCTTC CAGGAGCTGG CGGGCAAGGA 960 

GCTGTGCGCC ATGTCGGAGG AGCAGTTCCG CCAGCGCTCG CCCCTGGGTG GGGATGTGCT 1020 

GCACGCCCAC CTGGACATCT GGAAGTCAGC GGCCTGGATG AAAGAGCGGA CTTCACCTGG 1080 

GGCGATTCAC TACTGTGCCT CGACCAGTGA GGAGAGCTGG ACOGACAGCG AGGTGGACTC 1140 

ATCATGCTCC GGGCAGCCCA TCCACCTGTG GCAGTTCCTC AAGGAGTTGC TACTCAAGCC 1200 

CCACAGCTAT GGOCGCTTCA TTAGGTGGCT CAACAAGGAG AAGGGCATCT TCAAAATTGA 1260 

GGACTCAGCC CAGGTGGCCC GGCTGTGGGG CATCCGCAAG AACCGTCCCG CCATGAACTA 1320 

CGACAAGCTG AGCCGCTCCA TCCGCCAGTA TTACAAGAAG GGCATCATCC GGAAGCCAGA 1380 

CATCTCCCAG CGCCTCGTCT ACCAGTTCGT GCACCCCATC TGAGTGCCTG GCCCAGGGCC 1440 

TGAAACCCGC CCTCAGGGGC CTCTCTCCTG CCTGCCCTGC CTCAGCCAGG CCCTGAGATG 1500 

GGGGAAAACG GGCAGTCTGC TCTGCTGCTC TGACCTTCCA GAGCCCAAGG TCAGGGAGGG 1560 

GCAACCAACT GCCCCAGGGG GATATGGGTC CTCTGGGGCC TTCGGGACCA TGGGGCAGGG 1620 

GTGCTTCCTC CTCAGGCCCA GCTGCTCCCC TGGAGGACAG AGGGAGACAG GGCTGCTOCC 1680 

CAACACCTGC CTCTGACCCC AGCATTTCCA GAGCAGAGCC TACAGAAGGG CAGTGACTCG 1740 

ACAAAGGCCA CAGGCAGTCC AGGCCTCTCT CTGCTCCATC CCCCTGCCTC CCATTCTGCA 1800 

CCACACCTGG CATGGTGCAG GGAGACATCT GCACCCCTGA GTTGGGCAGC CAGGAGTGCC 1860 

CCCGGGAATG GATAATAAAG ATACTAGAGA ACTG 



SEQ ID NO:100 PEN1 Protein sequence 
Protein Accession #: NP_036523 

1 11 21 31 41 51 

| | | | | | 

MGSASPGLSS VSPSHLLLPP DTVSRTGLEK AAAGAVGLER RDWSPSPPAT PEQGLSAFYL 60 
SYFDMLYPED SSWAAKAPGA SSREEPPEEP EQCPVIDSQA PAGSLDLVPG GLTLEEHSLE 120 
QVQSMVVGEV LKDIETACKL LNITADPMDW SPSNVQKWLL WTEHQYRLPP MGKAFQELAG 180 
KELCAMSEEQ FRQRSPLGGD VLHAHLDIWK SAAWMKERTS PGAIHYCAST SEESWTDSEV 240 
DSSCSGQPIH LWQFLKELLL KPHSYGRFIR WLNKEKGIFK IEDSAQVARL WGIRKNRPAM 300 
NYDKLSRSIR QYYKKGIIRK PDISQRLVYQ FVHPI 

SEQ ID NO:101 PEN3 DNA SEQUENCE 

Nucleic Acid Accession #: NM_000742 

Coding sequence: 555-2144 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

| | | | | | 

GAGAGAACAG CGTGAGCCTG TGTOCTTGTQ TGCTGAGCCC TCATCCCCTC CTGGGGCCAG 60 

GCTTGGGTTT CACCTGCAGA ATCGCTTGTG CTGGGCTCCC TGGGCTGTCC TCAGTGGCAC 120 

CTGCATGAAG CCGTTCTGGC TGCCAGAGCT GGACAGCCCC AGGAAAACCC ACCTCTCTGC 180 

AGAGCTTGCC CAGCTGTC CC CGGGAAGCCA AATGCCTCTC ATGTAAGTCT TCTGCTCGAC 240 

GGGGTGTCTC CTAAACCCTC ACTCTTCAGC CTCTGTTTGA CCATGAAATG AAGTGACTGA 300 

GCTCTATTCT GTACCTGCCA CTCTATTTCT GGGGTGACTT TTGTCAGCTG CCCAGAATCT 360 

CCAAGCCAGG CTGGTTCTCT GCATCCTTTC AATGACCTGT TTTCTTCTGT AACCACAGGT 420 

TCGGTGGTGA GAGGAAGCCT CGCAGAATCC AGCAGAATCC TCACAGAATC CAGCAGCAGC 480 

TCTGCTGGGG ACATGGTCCA TGGTGCAACC CACAGCAAAG CCCTGACCTG ACCTCCTGAT 540 

GCTCAGGAGA AGCCATGGGC CCCTCCTGTC CTGTGTTCCT GTCCTTCACA AAGCTCAGCC 600 

TGTGGTGGCT CCTTCTGACC CCAGCAGGTG GAGAGGAAGC TAAGCGCCCA CCTCCCAGGG 660 

CTCCTGGAGA CCCACTCTCC TCTCCCAGTC CCACGGCATT GCC6CAGGGA GGCTCGCATA 720 

CCGAGACTGA GGACCGGCTC TTCAAACACC TCTTCCGGGG CTACAACCQC TGGGCGCGCC 780 

CGGTGCCCAA CACTTCAGAC GTGGTGATTG TGCGCTTTGG ACTGTCCATC GCTCAGCTCA 840 

TCGATGTGGA TGAGAAGAAC CAAATGATGA CCACCAACGT CTGGCTAAAA CAGGAGTGGA 900 

GCGACT ACAA ACTGCGCTGG AACCOCGCTG ATTTTGGCAA CATCACATCT CTCAGGGTCC 960 

CTTCTGAGAT GATCTGGATC CCCGACATTG TTCTCTACAA CAATGCAGAT GGGGAGTTTG 1020 

CAGTGACCCA CATGACCAAG GCCCACCTCT TCTCCACGGG CACTGTGCAC TGGGTGCCCC 1080 

CGGCCATCTA CAAGAGCTCC TGCAGCATCG ACGTCACCTT CTTCCCCTTC GACCAGCAGA 1140 

ACTGCAAGAT GAAGTTTGGC TCCTGGACTT ATGACAAGGC CAAGATCGAC CTGGAGCAGA 1200 

TGGAGCAGAC TGTGGACCTG AAGGACTACT QGGAGAGCGG CGAGTGGGCC ATCGTCAATG 1260 

CCACGGGCAC CTACAACAGC AAGAAGTACG ACTGCTGCGC CGAGATCTAC CCCGACGTCA 1320 

CCTACGCCTT OOTCATCCGG CGGCTGCCGC TCTTCTACAC CATCAACCTC ATCATCCCCT 1380 

GCCTGCTCAT CTCCTGCCTC ACTGTGCTGG TCTTCTACCT GCCCTCCGAC TGCGGCGAGA 1440 

AGATCACGCT GTGCATTTOG GTGCTGCTGT CACTCACCGT CTTCCTGCTG CTCATCACTG 1500 

AGATCATCCC GTCCACCTCG CTGGTCATCC CGCTCATCGG CGAGTACCTG CTGTTCACCA 1560 

TGATCTTCGT CACCCTGTCC ATCGTCATCA CCGTCTTCGT GCTCAATGTG CACCACCGCT 1620 

CCCCCAGCAC CCACACCATG CCCCACTGGG TGCGGGGGGC CCTTCTGGGC TGTGTGCCCC 1680 

GGTGGCTTCT GATGAACCGG CCCCCACCAC CCGTGGAGCT CTGCCACCCC CTACGCCTGA 1740 

AGCTCAGCCC CTCTTATCAC TGGCTGGAGA GCAACGTGGA TGCCGAGGAG AGGGAGGTGG 1800 

TGGTGGAGGA GGAGGACAGA TGGGCATGTG CAGGTCATGT GGCCCCCTCT GTGGGCACCC 1860 

TCTGCAGCCA CGGCCACCTG CACTCTGGGG CCTCAGGTCC CAAGGCTGAG GCTCTGCTGC 1920 

AGGAGGGTGA GCTGCTGCTA TCACCCCACA TGCAGAAGGC ACTGGAAGGT GTGCACTACA 1980 

TTGCCGACCA CCTGCGGTCT GAGGATGCTG ACTCTTCGGT GAAGGAGGAC TGGAAGTATG 2040 

TTGCCATGGT CATCGACAGG ATCTTCCTCT GGCTGTTTAT CATCGTCTGC TTCCTGGGGA 2100 

CCAICGGCCT CTTTCTGCCT CCGTTCCTAG CTGGAATGAT CTGACTGCAC CTCCCTCGAG 2160 

CTGGCTCCCA GGGCAAAGGG GAGGGTTCTT GGATGTGGAA GGGCTTTGAA CAATGTTTAG 2220 

ATTTGGAGAT GAGCCCAAAG TGCCAGGOAG AACAGCCAGG TGAGGTGGGA GGTTGGAGAG 2280 

CCAGGTGAGG TCTCTCTAAG TCAGGCTGGG GTTGAAGTIT GGAGTCTGTC CGAGTTTGCA 2340 

GGGTGCTGAG CTGTATGGTC CAGCAGGGGA GTAATAAGGG CTCTTCOGGA AGGGGAGGAA 2400 

GCGGGAGGCA GGCCTGCACC TGATGTGGAG GTACAGGCAG ATCTTCCCTA CCGGGGAGGG 2460 

ATGGATGGTT GGATACAGGT GGCTGGGCTA TTCCATCCAT CTGGAAGCAC ATTTGAGCCT 2520 

CCAGGCTTCT CCTTGACGTC ATTCCTCTCC TT 0C T TCCTG CAAAATGGCT CTGCACCAGC 2580 

CGGCCCCCAG GAGGTCTGGC AGAGCTGAGA GCCATGGCCT GCAGGGGCTC CATATGTCCC 2640 
TACGCGTGCA GCAGGCAAAC AAGA 

SEQ ID NO:102 PEN3 Protein sequence 
Protein Accession #: NP_000733 

1 11 21 31 41 51 

| | | | | | 

MGPSCFVFLS FTKLSLWWLL LTPAGGEEAK RPPPRAPGDP LSSPSPTALP QGGSHTETED 60 

RLPKHLFRGY NRWARPVPNT SDWIVRFGL SIAQLIDVDE KNQMMTTNVW LKQEWSDYKL 120 

RWKPADPGNI TSLRVPSEMI WIPDIVLYNN ADGEFAVTHM TKAHLFSTGT VHWVPPAIYK 180 

SSCSIDVTFF PPDQONCKMK FGSWIYDKAK IDLEQMEQTV ELKDYWESGE WAIVNATGTY 240 

NSKKYDCCAE IYPDVTXAFV IRRLPLFYTI NLIIPCLLIS CLTVLVPYLP SDCGEKITLC 300 

ISVLLSLTVF LLLITEIIPS TSLVIPLIGE YLLFTMIFVT LSIVITVFVL NVHHRSPSTH 360 

TMPHWVRGAL LGCVFRWLLH NRPPPPVELC HPLRLKLSPS YHWLESNVDA EEREWVEEE 420 

DRWACAGHVA PSVGTLCSHG ELH5GASGPK AEALLQEGEL LLSPHMQKAL EGVHYIADHL 480 
RSEDADSS VK EDWKYVAMVI DRIFLWLFQ VCFLGTIGLF LPPFLAGM1 

SEQ ID NO:103 PEU4 DNA SEQUENCE 

Nucleic Acid Accession #: NM_018670 

Coding sequence: 87-893 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

| | | | | | 

CACGAGGCTG GAAGGGGCCA CTTCACACCT CGGGCTCGGC ATAAAGCGGC CGCCGGCCGC 60 

CGGCCCCCAG ACGCGCCGCC GCTGCCATGG CCCAGCCCCT GtGCCCGCCG CTCTCCGAGT 120 

CCTGGATGCT CTCTGCGGCC TGGGGCCCAA CTCGGCGGCC GCCGCCCTCC GACAAGGACT 180 

GCGGCCGCTC CCTCGTCTCG TCCCCAGACT CATGGGGCAG CACCCCAGCC GACAGCCCCG 240 

TGGCGAGCCC CGCGCGGCCA GGCACCCTCC GGGACCCCCG CGCCCCCTCC GTAGGTAGGC 300 

GCGGCGCGCG CAGCAGCCGC CTGGGCAGCG GGCAGAGGCA GAGCGCCAGT GAGCGGGAGA 360 

AACTGCGCAT GCGCACGCTG GCCCGCGCCC TGCACGAGCT GCGCCGCTTT CTACCGCCGT 420 

CCGTGGCGCC CGCGGGCCAG AGCCTGACCA AGATCGAGAC GCTGCGCCTG GCTATCCGCT 480 

ATATCGGCCA CCTGTCGGCC GTGCTAGGCC TCAGCGAGGA GAGTCTCCAG CGCCGGTGCC 540 

GGCAGCGCGG TGACGCGGGG TCCCCTCGGG GCTGCCCGCT GTGCCCCGAC GACTGCCCCG 600 

CGCAGATGCA GACACGGACG CAGGCTGAGG GGCAGGGGCA GGGGCGCGGG CTGGGCCTGG 660 

TATCCGCCGT CCGCGCCGGG GCGTCCTGGG GATCCCCGCC TGCCTGCCCC GGAGCCCGAQ 720 

CTGCACCCQA GCCOCGOQAC OCGCCTGCGC TGTTCGCCGA OOCGGCGTGC COGGAAGOGC 780 

AGGCGATGGA GCCAAGCCCA CCGTCCCC6C TCCTTCCGGG CGACGTGCTG GCTCTGTTGG 840 

AGACCTGGAT GCCCCTCTCG CCTCTGGAGT GGCTGOCTGA GGAGCCCAAG TGACAAGGGA 900 

CAACT6ACGC CGTCTCTGTG AGCACCGA6G CTTTTTGGCC TCAGCACCTT CGAA6TG6TT 960 

CCTTGGCAGA CTGCCTTTCC TGGAAGAGGG CACGGGCGAT CCCGACGGGG GCATTCCTGC 1020 

GGGTGAGAGC CGTCCCCACC GCGGCGGCCC TTCTCAGCCC CTCCCTCCAT GGAGGGACCC 1080 

ATAGGGCTAG ACACTTTGAG GCAAGCAGQA GGCTCTGC C T AATGTGAATT TATTTATTTG 1140 
TGAATAAACT GTACTGGTGT CAAAAAAAAA AAAAAAAAAA A 



SEQ ID NO:104 PEU4 Protein sequence 
Protein Accession #: NP_061140 

1 11 21 31 41 51 

| | | | | | 

MAQPLCPPLS ESWHLSAAWG PTRRPPPSDK DCGRSLVSSP DSWGSTPADS PVASPARPGT 60 
LROFRAPSVG RRGARSSRLG SGQRQSASER EKLRMRTLAR ALHELRRFLP PSVAPAGQSL 120 
TKIETLRLAI RYIGHLSAVL GLSEESLQRR CRQRGDAGSP RGCPLCPDDC PAQMQTRTQA 180 
EGQGQGRGLC LVSAVRAGAS WGSPPACPGA HAAPEPRDPP ALFAEAACPE GCAMEPSPPS 240 
PLLPGDVLAL LETWMPLSPL EWLPEEPK 

SEQ ID NO:105 PEU5 DNA SEQUENCE 

Nucleic Acid Accession #: NM_017636 

Coding sequence: 324-3374 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

| | | | | | 

CCACGGAGAA GCCCACCGAT GCCTACGGAG AGCTGGACTT CACGGGGGCC GGCCGCAAGC 60 

ACAGCAATTT CCTCCGGCTC TCTGACCGAA CGGATCCAGC TGCAGTTTAT AGTCTGGTCA 120 

CACGCACATG GGGCTTCCGT GCCCCGAACC TGGTGGTGTC AGTGCTGGGG GGATCGGGGG 180 

GCCCCGTCCT CCAGACCTGG CTGCAGGACC TGCTGCGTCG TGGGCTGGTG CGGGCTGCCC 240 

AGAGCACAGG AGCCTGGATT GTCACTGGGG GTCTGCACAC GGGCATCGGC CGGCATGTTG 300 

GTGTGGCTGT ACGGGACCAT CAGATGGCCA GCACTGGGGG CACCAAGGTG GTGGCCATGG 360 

GTGTGGCCCC CTGGGGTGTG GTCCGGAATA GAGACACCCT CATCAACCCC AAGGGCTCGT 420 

TGOCTGCGAG GTACCGGTGG CGCGGTGACC CGGAGGACGG GGTCCAGTTT CCCCTGGACT 480 

ACAACTACTC GGOCTTCTTC CTGGTGGACG ACGGCACACA CGGCTGCCTG GGGGGCGAGA 540 

ACCGCTTCCG CTTGCGCCTG GAGTCCTACA TCTCACAGCA GAAGACGGGC GTGGGAGGGA 600 

CTGGAATTGA CATCCCTGTC CTGCTCCTCC TGATTGATGG TGATGAGAAG ATGTTGACGC 660 

GAATAGAGAA CGCCACCCAG GCTCAGCTCC CATGTCTCCT CGTGGCTGGC TCAGGGGGAG 720 

CTGCGGACTG CCTGGCGGAG A COCTG GAAG ACACTCTGGC CCCAGGGAGT GGGGGAGCCA 780 

GGCAAGGCGA AGCCCGAGAT CGAATCAGGC GTTTCTTTCC CAAAGGGGAC CTTGAGGTCC 840 

TGCAGGCCCA GGTGGAGAGG ATTATGACCC GGAAGGAGCT CCTGACAGTC TATTCTTCTG 900 

AGGATGGGTC TGAGGAATTC GAGACCATAG TTTTGAAGGC CCTTGTGAAG GCCTGTGGGA 960 

GCTCGGAGGC CTCAGCCTAC CTGGATGAGC TGCGTTTGGC TGTGGCTTGG AACCGCGTGG 1020 

ACATTGCCCA GAGTGAACTC TTTCGGGGGG ACATCCAATG GCGGTCCTTC CATCTCGAAG 1080 

CTTCCCTCAT GGACGCCCTG CTGAATGACC GGCCTGAGTT CGTGCGCTTG CTCATTTCCC 1140 

ACGGCCTCAG CCTGGGCCAC TTCCTGACCC CGATGCGOCT GGCCCAACTC TACAGCGCGG 1200 

CGCCCTCCAA CTCGCTCATC CGCAACCTTT TGGACCAGGC GTCCCACAGC GCAGGCACCA 1260 

AAGCCCCAGC CCTAAAAGGG GGAGCTGCGG AGCTCCGGCC CCCTGACGTG GGGCATGTGC 1320 

TGAGGATGCT GCTGGGGAAG ATGTGCGCGC CGAGGTACCC CTCCGGGGGC GCCTGGGACC 1380 

CTCACCCAGG CCAGGGCTTC GGGGAGAGCA TGTATCTGCT CTCGGACAAG GCCACCTCGC 1440 

CGCTCTCGCT GGATGCTGGC CTOGGGCAGG CCCCCTGGAG CGACCTGCTT CTTTGGGCAC 1500 

TGTTGCTGAA CAGGGCACAG ATGGCCATGT ACTTCTGGGA GATGGGTTCC AATGCAGTTT 1560 

CCTCAGCTCT TGGGGOCTGT TTGCTGCTCC GGGTGATGGC ACGCCTGGAG CCTGACGCTG 1620 

AGGAGGCAGC AOGGAGGAAA GACCTGGCGT TCAAGTTTGA GGGGATGGGC GTTGACCTCT 1680 

TTGGCQAGTG CTATCGCAGC AGTGAGGTGA GGGCTGCCCG CCTCCTCCTC CGTOGCTGCC 1740 

CGCTCTGGGG GGATGCCACT TGCCTCCAGC TGGCCATGCA AGCTGAOGCC CGTGCCTTCT 1800 

TTGCCCAGGA TGGGGTACAG TCTCTGCTGA CACAGAAGTG GTGGGGAGAT ATGGCCAGCA 1860 

CTACACCCAT CTGGGCCCTG GTTCTCGCCT TCTTTTGCCC TCCACTCATC TACACCCGCC 1920 

TCATCACCTT CAGGAAATCA GAAGAGGAGC CCACACGGGA GGAGCTAGAG TTTGACATGG 1980 

ATAGTGTCAT TAATGGGGAA GGGCCTGTCG GGACGGCGGA CCCAGCCGAG AAGACGCCGC 2040 

TGGGGGTCCC GOGCCAGTCG GGCCGTCCGG GTTGCTGCGG GGGCCGCTGC GGGGGGCGCC 2100 

GGTGCCTACG COGCTGGTTC CACTTCTGGG GCGCGCCGGT GACCATCTTC ATGGGCAACG 2160 

TGGTCAGCTA CCTGCTGTTC TTGCTGCTTT TCTCGCGGGT GCTGCTCGTG GATTTCCAGC 2220 

CGGCGCCGCC CGGCTCCCTC GAGCTGCTGC TCTATTTCTG GGCTTTCACG CTGCTGTGCG 2280 

AGGAACTGCG CCAGGGCCTG AGCGGAGGCG GGGGCAGCCT CGCCAGCGGG GGGCCCGGGC 2340 

CTGGCCATGC CTCACTGAGC CAGCGCCTGC GCCTCTACCT CGCCGACAGC TGGAACCAGT 2400 

GCGACCTAGT GGCTCTCACC TGCTTCCTCC TGGGCGTGGG CTGCCGGCTG ACCCOGGGTT 2460 

TGTACCACCT GGGCCGCACT GTCCTCTGCA TCGACTTCAT GGTTTTCACG GTGCGGCTGC 2520 

TTCACATCTT CACGGTCAAC AAACAGCTGG GGCCCAAGAT OGTCATCGTG AGCAAGATGA 2580 

TGAAGGACGT G T TCT l t- 1 ' 1^ CTCTTCTTCC TCGGCGTGTG GCTGGTAGCC TATGGCGTGG 2640 

CCACGGAGGG GCTCCTGAGG CCACGGGACA GTGACTTCCC AAGTATCCTG CGCCGCGTCT 2700 

TCTACCGTCC CTACCTGCAG ATCTTCGGGC AGATTCCCCA GGAGGACATG GACGTGGCCC 2760 

TCATGGAGCA CAGCAACTGC TCGTCGGAGC CCGGCTTCTG GGCACACCCT CCTGGGGCCC 2820 

AGGCGGGCAC CTGCGTCTCC CAGTATGCCA ACTGGCTGGT GGTGCTGCTC CTCGTCATCT 2880 

TCCTGCTCGT GGCCAACATC CTGCTGGTCA ACTTGCTCAT TGCCATGTTC AGTTACACAT 2940 

TCGGCAAAGT ACAGGGCAAC AGCGATCTCT ACTGGAAGGC GCAGCGTTAC CGCCTCATCC 3000 

GGGAATTCCA CTCTCGGCCC GCGCTGGCCC CGCCCTTTAT CGTCATCTCC CACTTGCGCC 3060 

TCCTGCTCAG GCAATTGTGC AGGCGACCCC GGAGCCCCCA GCCGTCCTCC CCGGCCCTCG 3120 

AGCATTTCCG GGTTTACCTT TCTAAGGAAG CCGAGCGGAA GCTGCTAACG TGGGAATCGG 3180 

TGCATAAGGA GAACTTTCTG CTGGCACGCG CTAGGGACAA GCGGGAGAGC GACTCCGAGC 3240 

GTCTGGAGCG CACGTCCCAG AAGGTGGACT TGGCACTGAA ACAGCTGGGA CACATCCGCG 3300 

AGTACGAACA GCGCCTGAAA OTGCTGGAGC GGGAGGTCCA GCAGTGTAGC CGCGTCCTGG 3360 

GGTGGGTGAC GTAGGCCGTT ASCAGCTCTG CCATGTTGCC CTCAGGTGGG GCGCCAOCCC 3420 

TTQACCTGCA TGGGTCCAAA GAGTGAGCCA TGCTGGCGOA TTTTAAGGAG AAGCCCCCAC 3480 

AGGGGATTTT GCTCTTAGAG TAAGGCTCAT GTGGGCCTCQ GCCCCOGCAC CTGGTGGCCT 3540 

TGTCCTTGAG GTGAGCCCCA TGTCCATCTG GGCCACTGTC AGGACCACCT TTGGGAGTGT 3600 

CATCCTTACA AACCACAGCA TGCCCGGCTC CTCCCAGAAC CAGTCCCAGC CTGGGAGGAT 3660 

CAA6GCCT6G ATCCCG66CC GTTATCCATC TG6AGGCTGC AGGGTCCTTG GGGTAACAGG 3720 

GACCACAGAC CCCTCACCAC TCACAGATTC CTCACACTGG GGAAATAAAG CCATTTCAGA 3780 
GGAAAAAAAA AAAAAAAAAA AAAAAAAAAA 

SEQ ID NO:106 PEU5 Protein sequence 
Protein Accession #: NP_060106 

1 11 21 31 41 51 

| | | | | | 

MASTGGTKW AHGVAPWGW RNRDTLINPK GSFPARYKWR GDPEDGVQFP LDYNYSAFPL 60 

VDDCTHGCLG GENRFRLRLE SYISQQKTGV GGTGIDIPVL LLLZDGDEKM LTRIENATQA 120 

QLPCLLVAGS GGAADCLAET LEDTLAPGSG GARQGEARDR 1RRFFPKGDL EVLQAQVERI 180 

HTRKELLTVY SSBDGSEEFB TIVLKALVKA CGSSEASAYL DELRLAVAWN RVDIAQSELF 240 

RGDIQWRSFH LEASLMDALL NDRPEFVRLL ISHGLSLGHF LTFMRLAQLY SAAPSNSLIR 300 

NLLOQASKSA GTKAPALKGG AAELRPPDVG HVLRMLLGKM CAPRYPSGGA WDPHPGQGFG 360 

ESHYLLSDKA TSPLSLDAGL GQAPWSDLLL WALLLNRAQM AMYFWEMGSN AVSSALGACL 420 

LLRVMARLEP DAEEAARRKD LAFKFEGMGV DLFGECYRSS EVRAARLLLR RCPLWGDATC 480 

LQLAMQADAR AFFAQDGVQS LLTQKWWGDM ASTTPIWALV LAFFCPPLIY TRLITFRKSE 540 

EEPTREELEF DMDSVTNGBG PVGTADPAEK TPLGVPKQSG RPGCCGGRCG GRRCLREWPH 600 

FWGAFVTIFM GNWSYLLFL LLFSRVLLVD FQPAPP6SLE LLLYFWAFTL LCEELRQGLS 660 

GGGGSLASGG PGPGHASLSQ RLRLYLADSW NQCDLVALTC PLLGVGCRLT PGLYHLGRTV 720 

LCIDFHVFTV RLLHIFTVNK QLGPKIVTVS KMHKDVFPFL FPLGVWLVAY GVATEGLLRP 780 

RDSDFPSILR RVFYRPYLQI FGQIPQEDMD VALMEHSNCS SEPGFWABPP GAQAGTCVSQ 840 

YANWLWLLL VIFLLVANIL LVNLLIAMFS YTFGKVQGHS DLYWKAQRYR LIREFHSRPA 900 

LAPPFIVISH LRLLLRQLCR RFRSPQPSSP ALEHFRVYLS KEAERKLLTW ESVHKENFLL 960 
ARARDKRESD SERLERTSQK VDLALKQLGH IREYEQRLKV LEREVQQCSR VLGWVT 

SEQ ID NO:107 PEW3 DNA SEQUENCE 

Nucleic Acid Accession #: NM_005982 

Coding sequence: 276-1130 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

| | | | | | 

GGTAGCAGCA TCCACCGGGC GGGAGGTCGG AGGCAGCAAG GCCTTAAAGG CTACTGAGTG 60 

CGCCGGCCGT TCCGTGTCCA GAACCTCCCC TACTCCTCCG CCTTCTCTTC CTTGGCOGCC 120 

CACOGCCAAG TTCCGACTCC GGTTTTCGCC TTTGCAAAGC CTAAGGAGGA GGTTAGGAAC 180 

AGCOGCGCCC CCCTCCCTGC GGCCGCCGOC CCCTGOCTCT CGGCTCTGCT CCCTGCCGCG 240 

TGCGCCTGGQ CCGTGCGCCC CGGCAGGCGC CAGCCATGTC GATGCTGCCG TCGTTTGGCT 300 

TTACGCAGGA GCAAGTGGCG TGCGTGTGCG AGGTTCTGCA GCAAGGCGGA AACCTGGAGC 360 

GCCTGGGCAG GTTCCTGTGG TCACTGCCCG CCTGCGACCA CCTGCACAAG AACGAGAGCG 420 

TACTCAAGGC CAAGGCGGTG CTOGOCTTCC ACCGCGGCAA CTTCCGTGAG CTCTACAAGA 480 

TCCTGGAGAG CCACCAGTTC TCGCCTCACA ACCACCCCAA ACTGCAGCAA CTGTGGCTGA 540 

AGGCGCATTA CGTGGAGGCC GAGAAGCTGC GCGGCCGACC CCTGGGCGCC GTGGGCAAAT 600 

ATCGGGTGCG CCGAAAATTT CCACTGCCGC GCACCATCTG GGACGGCGAG GAGACCAGCT 660 

ACTGCTTCAA GGAGAAGTCG AGGG0TGTCC TGCGGGAGTG GTACGCGCAC AATCCCTACC 720 

CATCGCCGCG TGAGAAGCGG GAGCTGGCCG AGGCCACCGG CCTCACCACC ACCCAGGTCA 780 

GCAACTGGTT TAAGAACCGG AGGCAAAGAG ACCGGGCOGC GGAGGCCAAG GAAAGGGAGA 840 

ACACCGAAAA CAATAACTCC TCCTCCAACA AGCAGAACCA ACTCTCTCCT CTGGAAGGGG 900 

GCAAGCCGCT CATGTCCAGC TCAGAAGAGG AATTCTCACC TCCCCAAAGT CCAGACCAGA 960 

ACTCGGTCCT TCTGCTGCAG GGCAATATGG GCCACGCCAG GAGCTCAAAC TATTCTCTCC 1020 

CGGGCTTAAC AGCCTCGCAG CCCAGTCACG GCCTGCAGAC CCACCAGCAT CAGCTCCAAG 1080 

ACTCTCTGCT CGGCCCCCTC ACCTCCAGTC TGGTGGACTT GGGGTCCTAA GTGGGGAGGG 1140 

ACTGGGGCCT CGAAGGGATT CCTGGAGCAG CAACCACTGC AGCGACTAGG GACACTTGTA 1200 

AATAGAAATC AGGAACATTT TTGCAGCTTG TTTCTGGAGT TGTTTGCGCA TAAAGGAATG 1260 

GTGGACTTTC ACAAATATCT TTTTAAAAAT CAAAACCAAC AGCGATCTCA AGCTTAATCT 1320 
CCTCTTCTCT CCAACTCTTT CCACTTTTGC ATTTTCCTTC CCAATGCAGA GATCAGGG 



SEQ ID NO:108 PEW3 Protein sequence 
Protein Accession #: NP_005973 

1 11 21 31 41 51 

| | | | | | 

MSMLPSFGFT QEQVACVCEV LQQGGNLERL GRFLWSLPAC DHLHKNESVL KAKAVVAFHR 60 

GNFRELYKIL ESHQFSPHNH PKLQQLWLKA HYVEAEKLRG RPLGAVGKYR VRRKFPLPRT 120 

IWDGEETSYC FKEKSRGVLR EWYAHNPYPS PREKRELAEA TGLTTTQVSN WFKNRRQRDR 180 

AAEAKERENT ENNNSSSNKQ NQLSPLEGGK PLMSSSEEEF SPPQSPDQNS VLLLQGNMGH 240 
ARSSNYSLPG LTASQPSHGL QTHQHQLQDS LLGPLTSSLV DLGS 

SEQ ID NO:109 PFJ8 DNA SEQUENCE 

Nucleic Acid Accession #: NM_005069 

Coding sequence: 57-2060 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 
| | | | | | 

GGGGCTCCGC GGGCCTGGAG CACGGCCGGG TCTAATATGC CCGGAGCCGA GGCGCGATGA 60 
AGGAGAAGTC CAAGAATGCG GCCAAGACCA GGAGGGAGAA GGAAAATGGC GAGTTTTACG 120 
AGCTTGCCAA GCTGCTCCCG CTGCCGTCGG CCATCACTTC GCAGCTGGAC AAAGCGTCCA 180 
TCATCCGCCT CACCACGAGC TACCTGAAGA TGCGCGCCGT CTTCCCCGAA GGTTTAGGAG 240 
ACGCGTGGGG ACAGCCGAGC CGCGCCGGGC CCCTGGACGG CGTCGCCAAG GAGCTGGGAT 300 
CGCACTTGCT GCAGACTTTG GATGGATTTG TTTTTGTGGT AGCATCTGAT GGCAAAATCA 360 
TGTATATATC CGAGACOGCT TCTGTCCATT TAGGCTTATC CCAGGTGGAG CTCACGGGCA 420 
ACAGTATTTA TGAATACATC CATCCTTCTG ACCACGATGA GATGACCGCT GTOCTCACGG 480 
CCCACCAGCC GCTGCACCAC CACCTGCTCC AAGAGTATGA GATAGAGAGG TCGTTCTTTC 540 
TTCGAATGAA ATGTGTCTTG GCGAAAAGGA ACGGGGGGCT GACCTGCAGC GGATACAAGG 600 
TCATCCACTG CAGTGGCTAC TTGAAGATCA GGCAGTATAT GCTGGACATG TCCCTGTACG 660 
ACTCCTGCTA CCAGATTGTG GGGCTGGTGG CCGTGGGCCA GTCGCTGCCA CCCAGTGCCA 720 
TCACCGAGAT CAAGCTGTAC AGTAACATGT TCATGTTCAG GGCCAGCCTT GACCTGAAGC 780 
TGATATTCCT GGATTCCAGG GTGACCGAGG TGACGGGTTA CGAGCCGCAG GACCTGATCG 840 
AGAAGACCCT ATACCATCAC GTGCACGGCT GCGACGTGTT CCACCTCCGC TACGCACACC 900 
ACCTOCTGTT GGTGAAGGGC CAGGTCACCA GCAAGTACTA CCGGCTGCTG TCCAAGCGGG 960 
GCGGCTGGGT GTGGGTGCAG AGCTACGCCA CCGTGGTGCA CAACAGCCGC TCGTCCCGGC 1020 
CCCACTGCAT CGTGAGTGTC AATTATGTAC TCACGGAGAT TGAATACAAG GAACTTCAGC 1080 
TGTCCCTGGA GCAGGTGTCC ACTGCCAAGT CCCAGGACTC CTGGAGGACC GCCTTGTCTA 1140 
CCTCACAAGA AACTAGGAAA TTAGTGAAAC CCAAAAATAC CAAGATGAAG ACAAAGCTGA 1200 
GAACAAACCC TTACCCOCCA CAGCAATACA GCTCGTTCCA AATGGACAAA CTGGAATGCG 1260 
GCCAGCTCGG AAACTGGAGA GCCAGTCCCC CTGCAAGCGC TGCTGCTCCT CCAGAACTGC 1320 
AGCCCCACTC AGAAAGCAGT GACCTTCTGT ACACGCCATC CTACAGCCTG CCCTTCTCCT 1380 
ACCATTACGG ACACTTCCCT CTGGACTCTC ACGTCTTCAG CAGCAAAAAG CCAATGTTGC 1440 
CGGCCAAGTT CGGGCAGCCC CAAGGATCCC CTTGTGAGGT GGCACGCTTT TTCCTGAGCA 1500 
CACTGCCAGC CAGCGGTGAA TGCCAGTGGC ATTATGCCAA CCCCCTAGTG CCTAGCAGCT 1560 
CGTCTCCAGC TAAAAATGCT CCAGAGCCAC CGGCGAACAC TGCTAGGCAC AGCCTGGTGC 1620 
CAAGCTACGA AGCGCCCGCC GCOGCCGTGC GCAGGTTCGG CGAGGACACC GCGCCCCCGA 1680 
GCTTCCCGAG CTGCGGCCAC TACCGCGAGG AGCCCGCGCT GGGCC€GGCC AAAGCCGGCC 1740 
GCCAGGCCGC CCGGGACGGG GCGCGGCTGG CGCTGGCCCG CGCGGCACCC GAGTGCTGCG 1800 
CGCCCCCGAC CCCCGAGGCC CCGGGCGCGC CGGCGCAGCT GCCCTTCGTG CTGCTCAACT 1860 
ACCACCGCGT GCTGGCCCGG CGCGGACCGC TGGGGGGCGC CGCACCCGCC GCCTCCGGCC 1920 
TGGCCTGCGC TCCCGGCGGC CCOGAGGCGG CGACCGGCGC GCTGCGGCTC CGGCACCCGA 1980 
GCCCCGCCGC CACCTCCCCG CCCGGCGCGC CCCTGCCGCA CTACCTGGGC GCCTCGGTCA 2040 
TCATCACCAA CGGGAGGTGA CCCGCTGGCC GCCCGCGCCA GGAGCCTGGA CCCGGCCTCC 2100 
CGGGGCTGCG GCGCCACCGA GCCCGGCAAA TGCGCACGAC CTACATTAAT TTATGCAGAG 2160 
ACAGCTGTTT GAATTGGACC CCGCCGCCGA CTTGCGGATT TCCACCGCGG AGGCCCCGCG 2220 
CGCCGGTGCC GAGGGOCGAG GAGCGCCCX3G GTCCGGGCAG GTGACCGCCC GCCTCTGTCC 2280 
TGCGAGGGCC GGTGCGACGC AGTTGCTGGG GGCTTGGTTT CCTCACCTTG AAATCGGGCT 2340 
TCACGCGTCT TGCCTTGTCC CCAACGTTCC ACAACAGTCC CGCTGGGGGA TTGAAGCGGT 2400 
7TCACTCCGC AAATATOCTC CACTTTCAGG AGGGAAAACC CACCCTACCA CAGTCCGCTC 2460 
TTCCAAGTGG ACGGCAGACC TGGGAGGGGA CGCCTGTGTC ACGAGCOCTT TTAGATGCTT 2520 
AGGTGAAGGC AGAAGTGATG ATTGTAAGTC OCATGAATAC ACAACTCCAC TGTCTTTAAA 2580 
AGTCATTCAA GAGTCTCATT ATTTTTGTTT TTATTTAACC CTTTCTTCAA TACAAAAAGC 2640 
CAACAAACCA AGACTAAGGG GGTGACCATG CAATTCCATT TTGTGTCTGT GAACATAGGT 2700 
GTGCTTCCCA AATACATTAA CAAGCTCTTA CTTOCCCCTA AOCOCTATGA ACTCTTGATA 2760 
ACACCAAGAG TAGCACCTTC AGAATATATT GAATAGGCAT TAAATGCAAA AATATATATG 2820 
TAGCCAGACA GTTTATGAGA ATGACCCTGT CAAOCTTCAT TATTACGTGG CAAAATCCCT 2880 
CTGGCCCACA CAGATCTGTA ATTCACTAGG CTCGTGTTTG CTACAAATAG TGCTAATAAA 2940 
GTTAAATTGC ACGTGCAATA GGGAACACTG TCAATGGACT GCACCTTGTG AAGGAAAAAC 3000 
ATGCTTAAGG GGGTGTAATG AAAATGATGT AGACATTTTA AGCATTTTCT ACACAGGGAG 3060 
AAAACTTCGT AAGAACATGT TACGTGTGCA ACAGGTAAAC AGAAATCCTT TCATAAAGCA 3120 
CCAGCAGTGT TTAAAAAATG AGCTTCCATT AATTTTTACT TTTTATGGGT TTTGCTTAAA 3180 
GATCTCAACA TGGAAAAATC CTGTCATGGC TCTGAACTGC ACAATGCATT GAACCGCCGT 3240 
CCTTCAATTT TCTTCACACT ATCAACACTG CAGCATTTTG CTGCTTTATC AAAATGGTTT 3300 
ATTTTAGGAA ACTTTTTOCA CCTTTCTGAA TGGAAAGAGG TTTTCACAAA TGTTTTAAAC 3360 
TCATCGTTCT AAAATCAAGT GCACCTACAC CAACTGCTCT CAAAATGTGA ACTGACTTTT 3420 
TTTTTTTTTT TTTTGCCAAC CCTGTGTCAC TTAGTGAGGA CCTGACACAA TCCCTACAGG 3480 
GTGTCTGTCA GTGGGCCTCA TGGTAAGAGT CACAATTTGC AAATTTAGGA CCGTGGGTCA 3540 
TGCAGCGAAG GGGCTGGATG GTAGGAAGGG ATGTGCCCGC CTCTCCACGC ACTCAGCTAT 3600 
ACCTCATTCA CAGCTCCTTG TGAGTGTGTG CACAGGAAAT AAGCCGAGGG TATTATTTTT 3660 
TTATGTTCAT GAGTCTTGTA ATTAAACCGT GATTCTTGAA AGGTGTAGGT TTGATTACTA 3720 
GGAGATACCA CCGACATTTT TCAATAAAGT ACTGCAAAAT GCTTTTGTGT CTAOCTTGTT 3780 
ATTAACTTTT GGGGCTGTAT TTAGTAAAAA TAAATCAAGG CTATCGGAGC AGTTCAATAA 3840 
CAAAGGTTAC TGTTGAGAAA AAAGACCCTA TCATAGATTT ACAAG 



SEQ ID NO:110 PFJ8 Protein sequence 
Protein Accession #: NP_005060.1 

1 11 21 31 41 51 
| | | | | | 

MKEKSKNAAK TRREKENGEF YELAKLLPLP SAITSQLDKA SIIRLTTSYL KMRAVFPEGL 60 
GDAWGQPSRA GPLDGVAKEL GSHLLQTLDG FVFVVASDGK IMYISETASV HLGLSQVELT 120 
GNSIYEYIHP SDHDEMTAVL TAHQPLHHHL LQEYEIERSF FLRMKCVLAK RNAGLTCSGY 180 
KVIHCSGYLK IRQYMLDMSL YDSCYQIVGL VAVGQSLPPS AITEIKLYSN MFMFRASLDL 240 
KLIFLDSRVT EVTGYEPQDL IEKTLYHHVH GCDVFHLRYA HHLLLVKGQV TTKYYRLLSK 300 
RGGWVWVQSY ATVVHNSRSS RPHCIVSVNY VLTEIEYKEL QLSLEQVSTA KSQDSWRTAL 360 
STSQETRKLV KPKNTKMKTK LRTNPYPPQQ YSSFQMDKLE CGQLGNWRAS PPASAAAPPE 420 
LQPHSESSDL LYTPSYSLPF SYHYGHFPLD SHVFSSKKPM LPAKFGQPQG SPCEVARFFL 480 
STLPASGECQ WHYANPLVPS SSSPAKNFPE PPANTARHSL VPSYEAPAAA VRRFGEDTAP 540 
PSFPSCGHYR EEPALGPAKA ARQAARDGAR LALARAAPEC CAPPTPEAPG APAQLPFVLL 600 
NYHRVLARRG PLGGAAPAAS GLACAPGGPE AATGALRLRH PSPAATSPPG APLPHYLGAS 660 
VIITNGR 



SEQ ID NO:111 PFJ7 DNA SEQUENCE

Nucleic Acid Accession #: NM_006549

Coding sequence: 1-1254 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
| | | | | |

ATGAACGGAC GCTGCATCTG CCCGTCCCTG CCCTACTCAC CCGTCAGCTC CCCGCAGTCC 60 
TCGCCTCGGC TGCCCCGGCG GCCGACAGTG GAGTCTCACC ACGTCTCCAT CAOGGGTATG 120 
CAGGACTGTG TGCAGCTGAA TCAGTATACC CTGAAGGATG AAATTGGAAA GGGCTCCTAT 180 
GGTGTCGTC A AGTTGGOCTA CAATGAAAAT GACAATACCT ACTATGCAAT GAAGGTGCTG 240 
TCCAAAAAGA AGCTG ATCCG GCAGGCCGGC TTTCCAOGTC GCGCTCCACC CCGAGGCACC 300 
CGGCCAGCTC CTGGAGGCTG CATCCAGCCC AGGGGCCCCA TTGAGCAGGT GTACCAGGAA 360 
ATTGCCATCC TCAAGAAGCT GGACCACCCC AATGTGGTGA AGCTGGTGGA GGTCCTGGAT 420 
GAOCCCAATG AGG ACCATCT GTACATGGTG TTCGAACTGG TCAACCAAGG GCOOGTGATG 480 
GAAGTGCCCA CCCTCAAACC ACTCTCTGAA GACCAGGCCC GTTTCTACTT CCAGGATCTG 540 
ATCAAAGGCA TCGAGTACTT ACACTACCAG AAGATCATCC AGCGTGACAT CAAACCTTCC 600 
AACCTCCTGG TCGGAGAAGA TGGGCACATC AAGATCGCTG ACTTTGGTGT GAGCAATGAA 660 
TTCAAGGGCA GTGACGCGCT CXTCTCCAAC ACCGTGGGCA CGCCCGCCTT CATGGCACCC 720 
GAGTCGCTCT CTGAGACCCG CAAGATCTTC TCTGGGAAGG CCTTGGATGT TTGGGCCATG 780 
GGTGTGACAC TATACTGCTT TGTCTTTGGC CAGTGCCCAT TCATGGACGA GCGGATCATG 840 
TGTTTACACA GTAAGATCAA GAGTCAGGCC CTGGAATTTC CAGACCAGCC CGACATAGCT 900 
GAGGACTTG A AGGACCTG AT CAOCCGTATG CTGGACAAGA ACCCCGAGTC GAGGATCGTG 960 
GTGOCGGAAA TCAAGCTGCA CCCCTGGGTC AOGAGGCATG GGGCGQAGCC GTTGCCGTCG 1020 
GAGGATGAGA ACTGCACGCT GGTCGAAGTG ACTGAAGAGG AGGTCG AGAA CTCAGTCAAA 1080 
CACATTCCCA GCTTGGCAAC CGTGATCCTG GTGAAGACCA TGATACGTAA ACGCTCCTTT 1140 
GGG AACCCAT TCG AGGGCAG CCGGCGGG AG G AAGGCTCAC TGTCAGCGCC TGGAAACTTG 1200 
CTCACCAAAA AACCAACCAG GGAATGTGAG TCOCTGTCTG AGCTCAAGAC CIAQAAAATA 1260 
AGTCCCCTTC CTGCCTGTTG CAAAGTAACG TAAGAGTTCC CTCACCCGAG TGGATGCAGA 1320 
CGUCUGCT GTCAGCCACC TTCCTTCATA CACATAGCCA GCCCAGGGTG ACCAGAACGT 1380 
CCCAGG ACAG ATGAGGCTTT GTGTCCTTAT GAGAGTGGGA GAACCTGGTG GGCACCCCTG 1440 
GTGCAGGTGC TGTGGTGGGT GGGGACCCCA CTGCCTTTCC CACTGAGCAC ATCATGGCTA 1500 
CCTGACTTGG TGGGAGTTCC ATTCAGTCAC TTCIXjTTTCT TAAACATAGC TTTACTGAGG 1560 
TACAATTCAC ATACCATGTA ATTCACCCAC GGGAAGTGTA TGATTCAGTG GTTTCTAATA 1620 
CACACTTCTG CAGCCATTAC CACCGTCAAC TTTACGACAT TTTCATCAGC CCAAGAAGAC 1680 
ACCCTACACT CCTTAGCTGT CCCCATCCAA CTCCCCCACC CCAGTAACCA CTCAGAATAG 1740 
GTATGGATTT GCCTATTCTG G ACGTTTCGT ATAAATGGCG TCATACACTA AAAAAAAAAA 1800 
AAAA 



SEQ ID NO:112 PFJ7 Protein sequence
Protein Accession #: NP_006540.1

1 11 21 31 41 51
| | | | | |

MNGRCICPSL PYSPVSSPQS SPRLPRRPTV ESHHVSITGM QDCVQLNQYT LKDEJGKGSY 60 
GWKLAYNEN DNTYYAMKVL SKKKURQAG FPRRPPPRGT RPAPGGOQP RGPIEQVYQE 120 
IAILKKLDHP NWKLVEVLD DPNEDHLYMV FELVNQGPVM EVPTLKPLSE DQARFYFQDL 180 
KGIEYLHYQ KIIHRDIKPS NLLVGEDGHI KIADPGVSNE FKGSDALLSN TVGTPAFMAP 240 
ESLSETRKIF SGKALDVWAM GVTLYCFVFG QCPEMDERIM CLHSKKSQA LEFPDQPDIA 300 
EDLKDUTRM LDKNPESRIV VPEIKLHPWV TRHGAEPLPS EDENCTLVEV TEEEVENSVK 360 
mPSLATVIL VKTMIRKRSF GNPFEGSRRE ERSLSAPGNL LTKKPTRECE SLSELKT 



SEQ ID NO:113 PFJ6 DNA SEQUENCE

Nucleic Acid Accession #: NM_021810

Coding sequence: 1-429 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
| | | | | |

ATGAAACCTC TGATATGGAC ATGGTCAG AT GTTGAAGGCC AGAGGCCGGC TCTGCTCATC 60 
TGCACAGCTG CAGCAGGACC CACGCAGGGA GTTAAGGGTT ATGGCAAGCC CTTTGAGCCA 120 
AGAAGTGTGA AAAACATACA CTCTACTCCT GCTT ACOCAG ATGCCACAAT GCACAGACAA 180 
CTCCTGGCTC CGGTGG AAGG AAGGATGGCA GAGACATTG A ATCAGAAACT CCATtnTGCC 240 
AATGTGCTGG AAGATG ACCC CGGCTACCTA CCTCACGTCT ACAGCGAGGA AGGGG AGTGT 300 
GGAGGGGCCC CATCCCTCAG CTCTCTGGCC AGCTTGGAAC AGGAGTTGCA ACCTGATTTG 360 
CTGGACTCTT TGGGTTCAAA AGCGACTCCG TTTGAGGAAA TATATTCAGA GTCAGGTGTT 420 
CCTTCCTAA 

SEQ ID NO:114 PFJ6 Protein sequence
Protein Accession #: [illegible]

1 11 21 31 41 51
| | | | | |

MKPUWTWSD VEGQRPALLI CTAAAGPTQG VKGYGKPFEP RS VKNIHSTP AYPDATMHRQ 60 
LLAPVEGRMA ETLNQKLHVA NVLEDDPGYL PHVYSEEGEC GGAPSLSSLA SLEQELQPDL 120 
LDSLGSKATP FEEIYSESGV PS 
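
The coding-sequence ranges in the record headers can be cross-checked against the residue numbering printed beside the protein records. The following is a minimal sketch of that arithmetic, assuming the ranges are 1-based, inclusive nucleotide positions that include the stop codon (as the "start and stop codons" note suggests). The function name `expected_residues` is hypothetical and is not part of the listing.

```python
def expected_residues(start: int, end: int) -> int:
    """Residues encoded by a coding range.

    Assumes 1-based, inclusive nucleotide positions and that the
    range ends with the stop codon (assumption, see lead-in).
    """
    if start < 1 or end < start:
        raise ValueError("coordinates must satisfy 1 <= start <= end")
    length = end - start + 1  # inclusive range
    if length % 3 != 0:
        raise ValueError("coding range is not a whole number of codons")
    return length // 3 - 1  # the stop codon encodes no residue


if __name__ == "__main__":
    # Ranges taken from record headers in this listing.
    for start, end in [(1, 429), (131, 985), (70, 729)]:
        print(f"{start}-{end}: {expected_residues(start, end)} residues")
```

Under these assumptions a range of 1-429 corresponds to 142 residues, which agrees with a protein record whose last row starts after position 120 and holds 22 residues. A mismatch would point to a misread coordinate rather than to the sequence itself.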



SEQ ID NO:115 PFJ5 DNA SEQUENCE

Nucleic Acid Accession #: NM_006361

Coding sequence: 131-985 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
| | | | | |

CGAATGCAGG CGACTTGCGA GCTGGGAGCG ATTTAAAACG CTTTGGATTC CCCCGGCCTG 60 
GGTGGGGAGA GCGAGCTGGG TGCCCCCTAG ATTCCCCGCC CCCGCACCTC ATGAGCCGAC 120 
CCTCGGCTCC ATGGAGCCCG GCAATTATGC CACCTTGGAT GGAGCCAAGG ATATCGAAGG 180 
CTTGCTGGGA GCGGGAGGGG GGCGGAATCT GGTCGCCCAC TCCCCTCTGA CCAGCCACCC 240 
AGOGGCGCCT ACGCTGATGC CTGCTGTCAA CTATGCCCCC TTGGATCTGC CAGGCTCGGC 300 
GGAGOCGCCA AAGCAATGCC ACCCATGCCC TGGGGTGCCC CAGGGGACGT CCCCAGCTCC 360 
CGTGCCTTAT GGTTACTTTG GAGGCGGGTA CTACTCCTGC CGAGTGTCCC GGAGCTCGCT 420 
GAAACCCTGT GCCCAGGCAG CCACCCTGGC CGCGTACCCC GCGGAGACTC CCACGGCCGG 480 
GGAAGAGTAC CCCAGTCGCC CCACTQAGTT TCCCTTCTAT CCGGGATATC CGGGAACCTA 540 
CCACGCTATG GCCAGTTACC TGGACGTGTC TGTGGTGCAG ACTCTGGGTG CTCCTGG AGA 600 
ACCGCGACAT GACTCCCTGT TGCCTGTGGA CAGTTACCAG TOTGGGCTC TCGCTGGTGG 660 
CTGGAACAGC CAGATGTGTT GCCAGGGAGA ACAGAACCCA CCAGGTCCCT TTTGGAAGGC 720 
AGCATTTGCA GACTCCAGCG GGCAGCACCC TCCTGACGCC TGCGCCTTTC GTCGCGGCCG 780 
CAAGAAACGC ATTCCGTACA GCAAGGGGCA GTTGCGGGAG CTGGAGCGGG AGTATGCGGC 840 
TAACAAGTTC ATCACCAAGG ACAAGAGGCG CAAGATCTCG GCAGCCACCA GCCTCTCGGA 900 
GCGCCAGATT ACCATCTGGT TTCAGAACCG CCGGGTCAAA GAGAAGAAGG TTCTCGCCAA 960 
GGTGAAGAAC AGCGCTACCC CTTAAGAGAT CTCCTTGCCT GGGTGGGAGG AGCGAAAGTG 1020 
GGGGTGTOCT GGGGAGACCA GAAACCTGCC AAGCCCAGGC TGGGGCCAAG GACTCTGCTG 1080 
AGAGGCCCCT AGAGACAACA CCCTTCCCAG GCCACTGGCT GCTGGACTGT TCCTCAGGAG U40 
CGGCCTGGGT ACCCAGTATG TGCAGGGAGA CGGAACCCCA TGTGACAGGC CCACTCCACC 1200 
AGGGTTCCCA AAGAACCTGG CCCAGTCATA ATCATTCATC CTCACAGTGG CAATAATCAC 1260 
GATAACCAGT 



SEQ ID NO:116 PFJ5 Protein sequence:
Protein Accession #: NP_006352.1

1 11 21 31 41 51
| | | | | |

MEPGNYATLD GAKDEEGLLG AGGGRNLVAH SPLTSHPAAP TLMPAVNYAP LDLPGSAEPP 60 
KQCHPCPGVP QGTSPAPVPY GYFGGGYYSC RVSRSSLKPC AQAATLAAYP AETPTAGEEY 120 
PSRPTEFAFY PGYPGTYHAM AS YLDVSWQ TLGAPGEPRH DSLLPVDSYQ SWALAGGWNS 180 
QMCCQGJEQNP PGPFWKAAFA DSSGQHFPDA CAFRRGRKKR IPYSKGQLRE LEREYAANKF 240 
mCDKRRKIS AATSLSERQI TIWFQNRRVK EKKVLAKVKN SATP 



SEQ ID NO:117 PFJ4 DNA SEQUENCE

Nucleic Acid Accession #: NM_005628

Coding sequence: 591-2216 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
| | | | | |

GTAACCGCTA CTCCCGGACA CCAGACCACC GCCTTCCGTA CACAGGGGCC CGCATCCCAC 60 
CCTCCCGGAC CTAAGAGCCT GGGTCCCCTG TTTCCGGAGG TCCGCTTCCC GGCCCCCAGA 120 
TTCTGGCATC CCAGCCCTCA GTGTCCAAGA CCCAGGCAGC CCGGGTCCCC GCCTCCCGGA 180 
TCCAGGCGTC CGGGATCTGC GCCACCAGAA CCTAGCCTCC TGCAGACCTC CGCCATCTGG 240 
GGGCACTCAA CCTCCTGGAG CCAAGGGCCC CACGTCCCAC CCAG AGAAAC TCTCGTATTC 300 
CCAGCTCCTA GGGCCAAGGA ACCCGGGCGC TCCGAACTCC CAGCTTTCGG ACATCTGGCA 360 
CACGGGGCAG AGCAGAGAAG CTCAGCGCCC AGCCTGGGGA ATTTAAACAC TCCAGCTTCC 420 
AAGAGCCAAG GAACTTCAGT GCTGTGAACT CACAACTCTA AGGAGCCCTC CAAAGTTCCA 480 
GTCTCCAGGT GCTGTTACTC AACTCAGTCC TAGGAACGTC GGGTCCTGGG AAGGAGCCCA 540 
AGCGCTCCCA GCCAGCTTCC AGGCGCTAAG AAACCCCGGT GCTTCCCATC ATGGTGGCCG 600 
ATCCTCCTCG AGACTCCAAG GGGCTCGCAG CGGCGGAGCC CACCGCCAAC GGGGGCCTGG 660 
CGCTGGCCTC CATCGAGGAC CAAGGCGCGG CAGCAGGCGG CTACTGCGGT TCCCGGGACC 720 
AGGTGCGCCG CTGCCTTCGA GCCAACCTGC TTGTGCTGCT GACAGTGGTG GCCGTGGTGG 780 
CCGGCGTGGC GCTGGGACTG GGGGTGTCGG GGGCCGGGGG TGCGCTGGCG TTGGGCCCGG 840 
AGCGCTTGAG CGCCTTCGTC TTCCCGGGCG AGCTGCTGCT GCGTCTGCTG CGGATG ATCA 900 
TCTTGCCGCT GGTGGTGTGC AGdTGATCG GCGGCGCCGC CAGCCTGGAC CCCGGCGCGC 960 
TCGGCCGTCT GGGCGCCTGG GCGCTGCTCT TTTTCCTGGT CACCACGCTG CTGGCGTCGG 1020 
CGCTCGGAOT GGGCTTGGCG CTGGCTCTGC AGCCGGGCGC CGCCTOCGCC GCCATCAACG 1080 
CCTCCGTGGG AGCCGCGGGC AGTGCCGAAA ATGCCCCCAG CAAGGAGGTG CTCGATTCGT 1140 
TCCTGGATCT TGOGAGAAAT ATCTTCCCTT CCAACCTGGT GTCAGCAGCC TTTCGCTCAT 1200 
ACTCTACCAC CTATGA AG AG AGGAATATCA CCGOAAOCAG GGTGAAOGTG OCCGTGGGGC 1260 
AGGAGGTGGA GGGGATG AAC ATCCTGGGCT TGGTAGTGTT TGCCATCGTC TTTGGTGTGG 1320 
CGCTGCGGAA GCTGGGGCCT GAAGGGGAGC TGCTTATCCG CTTCTTCAAC TCCTTCAATG 1380 
AGGCCACCAT GGTTCTGGTC TCCTGG ATCA TGTGGTACGC CCCTGTGGGC ATCATGTTCC 1440 
TGGTGGCTGG CAAGATCGTG GAGATGGAGG ATGTGGGTTT ACTCTTTGCC CGCCITGGCA 1500 
AGTACATTCT GTGCTGCCTG CTGGGTCACG CCATCCATGG GCTCCTGGTA CTGCCCCTCA 1560 
TCTACTTCCT CTTCACCCGC AAAAACCCCT ACCGCTTOCT GTGGGGCATC GTGAOGOOGC 1620 
TGGCCACTGC CTTTGGGAOC TCTTOCAGTT GCGGCACGCT GCCGCTG ATG ATG AAGTGCG 1680 
TGGAGGAGAA TAATGGCGTG GCCAAGCACA TCAGCCGTTT CATCCTGCCC ATCGGCGCCA 1740 
CCGTCAACAT GG ACGGTGCC GCGCTCTTCC AGTGOGTGGC CGCAGTGTTC ATTGCACAGC 1800 
TCAGCCAGCA GTCCTTGGAC TTCGTAA AG A TCATCACCAT CCTGGTCACG GCCACAGCGT 1860 
CCAGCGTGGG GGCAGCGGGC ATCCCTGCTG GAGGTGTCCT CACTCTGGCC ATCATCCTCG 1920 
AAGCAGTCAA CCTCCCGGTC G ACCATATCT (XTTG ATCCT GGCTGTGGAC TGGCTAGTCG 1980 
ACCGGTCCTG TACCGTCCTC AATGTAOAAO GTGACGCTCT GGGGGCAGGA CTCCTCCAAA 2040 
ATTATGTGGA CCGTACGGAG TCGAGAAGCA CAGAGCCTGA GTTGATACAA GTGAAGAGTG 2100 
AGCTGCCCCT GGATCCGCTG CCAGTCCCCA CTGAGGAAGG AAACCCCCTC CTCAAACACT 2160 
ATCGGGGGCC CGCAGGGG AT GCCACGGTCG CCTCTGAGAA GGAATCAGTC ATGI&AACCC 2220 
CGGGAGGGAC CTT0CCTGCC CTGCTGGGGG TGCTCTTTGG ACACTGGATT ATGAGG AATG 2280 
GATAAATGGA TGAGCTAGGG CTCTGGGGGT CTGCCTGCAC ACTCTGGGGA GCCAGGGGCC 2340 
CCAGCACCCT CCAGGACAGG AG ATCTGGGA TGCCTGGCTG CTGGAGTACA TGTGTTCACA 2400 
AGGGTTACTC CTCAAAACOC OCAGTTCTCA CTCATGTCCC CAACTCAAGG CTAGAAAACA 2460 
GCAAGATGGA GAAATAATGT TCTGCTGCGT CCCCACCGTG ACCTGCCTGG CCTCCCCTGT 2520 
CTCAGGGAGC AGGTCACAGG TCACCATGGG GAATTCTAGC CCCCACTGGG GGGATGTTAC 2580 
AACACCATGC TGGTTATTTT GGCGGCTGTA GTTGTGGGGG GATGTOTGTG TGCACGTGTG 2640 
TGTGTGTGTG TGTGTGTGTG TGTGTGTGTG TTCTGTGACC TCCTGTCCCC ATGGTACGTC 2700 
CCAOCCTGTC CCCAGATCCC CTATTCCCTC CACAATAACA GAAACACTCC CAGGG ACTCT 2760 
GGGG AGAGGC TGAGGACAAA TACCTGCTGT CACTCCAGAG GACATTTTTT TTAGCAATAA 2820 
AATTGAGTGT CAACTATTTA AAAAAAAAAA AAAAAA 



SEQ ID NO:118 PFJ4 Protein sequence:
Protein Accession #: NP_005619.1

1 11 21 31 41 51
| | | | | |

MVADPPRDSK GLAAAEPTAN GGLALASIED QGAAAGGYOG SRDQVRRCLR ANLLVLLTW 60 
AWAGVALGL GVSG AGGALA LG PERLS AFV FPGELLLRLL RMIILPLWC SLIGGAASLD 120 
PGALGRLGAW ALLFFLVTTL LASALGVGLA LALQPGAASA AINASVGAAG SAENAPSKEV 180 
LDSFLDLARN IFPSNLVS AA FRSYSTTYEE RNITGTRVKV PVGQBVEGMN BLGLVYFAIV 240 
FGVALRKLGP EGELURFFN SFNEATMVLV SWIMWYAPVG IMFLVAGHV EMEDVGLLFA 300 
RLGKYILCCL LGHAIHGLLV UPLIYFLFTR KNFYRFLWGI VTPLATAFGT SSSS ATLPLM 360 
MKCVEENNGV AKHISRFDLP IGATVNMDGA ALPQCVAAVF LAQLSQQSLD FVKHTBLVT 420 
ATASSVGAAG IPAGGVLTLA HLEAVNLPV DHBULAVD WLVDRSCTVL NVEGDALGAG 480 
LLQNYVDRTE SRSTEPEUQ VKSEtPLDPL PVPTEEGNPL LKHYRGPAGD ATV ASEKES V 540 
M 



SEQ ID NO:119 PFJ3 DNA SEQUENCE

Nucleic Acid Accession #: NM_006708

Coding sequence: 88-642 (underlined sequences correspond to start and stop codons)
1 11 21 31 41 51

CTAGTTAAGG CGGCACAGGG CCGAGGCGTA GTGTGGGTGA CTTCTCCGTT CCTTGGGTCC 60 
CGTCGTCTGT GATACTGCAG TTCAGCCATG GCAGAACCGC AGCCCCCGTC CGGCGGCCTC 120 
ACGGACGAGG CCGCCCTCAG TTGCTGCTCC GACGCGGACC CCAGTACCAA GGATTTTCTA 180 
TTGCAGCAGA CCATGCTACG AGTGAAGGAT CCTAAGAAGT CACTGGATTT TTATACTAGA 240 
GTTCTTGG AA TG ACGCTAAT CCAAAAATGT GATTTTCCCA TTATG AAGTT TTCACTCTAC 300 
TTCTTGGCTT ATGAGGATAA AAATGACATC CCTAAAGAAA AAGATG AAAA AATAGCCTGG 360 
GCGCTCTCCA GAAAAGCTAC ACTTG AGCTG ACACACAATT GGGGCACTGA AGATGATGOG 420 
ACCCAG AGTT ACCACAATGG CAATTCAG AC CCTCGAGGAT TCGGTCATAT TGGAATTGCT 480 
GTTCCTGATG TATACAGTGC TTGTAAAAGG TTTG AAGAAC TGGGAGTCAA ATTTGTGAAG 540 
AAACCTGATG ATGGTAAAAT G AAAGGCCTG GCATTTATTC AAGATCCTGA TGGCTACTGG 600 
ATTGAAATTT TGAATCCTAA CAAAATGGCA ACCTTAATGT AGTGCTGTGA GAATTCTCCT 660 
TTGAGATTTC AGAAGAAAGG AAACAATGTG ATTCAAGATA TTTACATACC AGAAGCATCT 720 
AGGACTG ATG GATCACTGTC CCGATTCAAA TTATTCTTCA GTCCATTTCC CCTTOCTATT 780 
TCAGCTGTTC CTTTTCACCT AACTGTTCAG TCATTCTGGT TTTCAAGCAG TGCTTTATCT 840 
CATGTCCTTG AATATAGTTG TGTAACTTTA TTTTTTAGGT AATAATTAGA ACAGTTCCCT 900 
TCAGAGGCTG CATTTGCCTT CTTOGCCAC CTAAATATTA t ' l IUX1 1 CA AATCTGCCTT 960 
TG AATCATCA TTTTTAAAAA AAAATTAACA TGTTTTTGTT GTAGTTATCT TCTGGGGTTT 1020 
CAATTCCTCA GAAACAACTT TTTTCACAAC GGAAAGGAAA GAACACTAGT GTTCTTTCAG 1080 
TAAAGTACAA AGTGTTTATT TTACAAAAG A GTAGGTACTC TTGAGAGCAA TTCAAATCAT 1140 
GCTGACAAGG ATACTGATAG AAAAAGTGAT TTCTTCTTAT TATAAAGTAC ATTTAAAGTT 1200 
CAAGGACTAA CCTTATTTAT TTGGG AAAGG GGAGGAGGAA GGAAATG ATA TGGTAGOCAG 1260 
ACACTGGGCT AGGCTGCAAC TTTATCTCAT TTAATACTOC CAGCTGTCAT GTGAGAAAG A 1320 
AAGCAGGCTA GGCATGTGAA ATCACTTTCA TGOATTATTA ATGGATTTAA GAGGGCATCA 1380 
ATCAGCTCAA CTCAAGATTT CATAATCATT TTTAGTATTT AGATTGTGCC TCAAAGTTGT 1440 
AGTACCTCAC AATACCTCCA CTGGTTTOCT GTTGTAAAAA CCTTCAGTGA GTTTGACCAT 1500 
TGTGCTCTTG GCTCTTGGGC TGGAGTACCG TGGTGAGGGA GTAAACACTA G AAGTCTTTA 1560 
GTACAAAACT GCTCTAGGGA CACCTGGTGA TTCCTACACA AGTGATGTTT ATATTTCTCA 1620 
JAAAGAGTCT TCCCTATCCC AAGGTCTTCA TGATGCCAGT AGCCATATAT GATAAATTAT 1680 
GTTCAGTGAT AACTTAGTTA TCAOAAATCA GCTCAGTCGT CTTCCCCGCC ATOATTCACA 1740 
TTTGATGAGT TTTTAAAAAT CAAAGTGATT TTGAAAATCT CTAATGGCTC AGAAAATAAA 1800 
AACATOCAGT TTGTGGATGA CTATATTTAG ATTTCTCTAG ACTCTAGTGG AAG AOCTTTG 1860 
GAAAGGCCAT GCCAACCGTG CTTGTACTGC TAGAAGCACT TTATGTTTCC TTTTTGGGTG 1920 
AAATGGATTT ATGTGAGTGC TTTAAACAAA TAGCAATACTTATAGACTGA AATAAAATGA 1980 
AACTTCAAAT AAG 



SEQ ID NO:120 PFJ3 Protein sequence
Protein Accession #: NP_006699.1

1 11 21 31 41 51
| | | | | |

MAEPQPPSGG LTDEAALSCC SDADPSTKDF LLQQTMLRVK DPKKSLDFYT RVLGMTUQK 60 
CDFPIMKFSL YFLAYEDKND IPKEKDEKIA WALSRKATLE LTHNWGTEDD ATQS YHNGNS 120 
DPRGFGHIGI AVPDVYSACK RFEELGVKFV KKPDDGKMKG LAHQDPDGY WIEILNPNKM 180 



SEQ ID NO:121 PFJ2 DNA SEQUENCE

Nucleic Acid Accession #: NM_002867

Coding sequence: 70-729 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
| | | | | |

CCGACGCCAG GTCCTGCCGT CCCGCCGACC GTCCGGGAGC GAACCCGTCG TCCCGCACTG 60 
GAGTCCGCGAJTSGCTTCAGT OACAGATGOT AAACATGGAG TCAAAGATGC CTCTGACCAG 120 
AATTTTGACT ACATGTTTAA ACTGCTTATC ATTGGCAACA GCAGTGTTGG CAAGACCTCC 180 
TTCCTCTTGC GCTATGCTGA TGACACGTTC ACCCCAGCCT TCGTTAGCAC CGTGGGCATC 240 
GACTTCAAGG TGAAGACAGT CTACCGTCAC GAGAAGOGGG TGAAACTGCA GATCTGGGAC 300 
ACAGCTGGGC AGGAGCGGTA CCGGACCATC ACAACAGCCT ATTACCGTGG GGCCATGGGC 360 
TTCATTCTGA TGTATGACAT CACCAATG AA GAGTCCTTCA ATGCTGTCCA AGACTGGGCT 420 
ACTCAGATCA AGACCTACTC CTGGG ACAAT GCACAAGTTA TTCTGGTGGG GAACAAGTGT 480 
GA CATGO AG G AAGAG AGGGT TGTTCCCACT GAGAAGOGCC AGCTCCTTGC AGAGCAGCTT 540 
GGGTTTGATT TCTTTGAAGC CAGTGCAAAG GAGAACATCA GTGTAAGGCA GGCCTTTGAG 600 
CGCCTGGTGG ATGCCATTTG TGACAAGATG TCTGATTCGC TGGACACAGA CCCGTCGATG 660 
CTGGGCTCCT CCAAGAACAC GCGTCTCTCG GACAOCOCAC CGCTGCTGCA GCAGAACTGC 720 
TCATGdAgC AAGGCCCACC TTCCTGACCT CCCCTCATTG TGGOCOCACA CCCAAGTCTG 780 
CTTCTCCCTG TTACACACTG TCCGCTCT 

SEQ ID NO:122 PFJ2 Protein sequence
Protein Accession #: NP_002658.1

1 11 21 31 41 51
| | | | | |

MASVTDGKHG VKDASDQNFD YMFKLLUGN SSVGKTSHX RYADDTFTPA FVSTVGIDFK 60 
VKTVYRHEKR VKLQIWDTAG QERYRTITTA YYRG AMGFIL MYDITNEESF NAVQDWATQI 120 
KTYSWDNAQV ILVGNKCDME EER WPTEKG QLLAEQLGFD FFEAS AKENI SVRQAFERLV 180 
DAICDKMSDS LDTDPSMLGS SKNTRLSDTP PLLQQNCSC 
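
The DNA records print the sequence in blocks of ten with a running position count at the end of each row, so a coding range has to be applied to the joined sequence. The following is a minimal sketch of that step, assuming every all-digit token in a row is a counter or margin number rather than sequence. The names `join_rows` and `extract_coding` and the toy rows are hypothetical and are not taken from the listing.

```python
def join_rows(rows):
    """Join printed sequence rows into one string.

    Assumption: any all-digit token (running position count,
    stray margin number) is not part of the sequence.
    """
    parts = []
    for row in rows:
        tokens = [t for t in row.split() if not t.isdigit()]
        parts.append("".join(tokens))
    return "".join(parts).upper()


def extract_coding(sequence, start, end):
    """Return the 1-based, inclusive range start..end of sequence."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("range falls outside the sequence")
    return sequence[start - 1:end]


if __name__ == "__main__":
    # Toy rows for illustration only (10 bases per row, not real data).
    toy_rows = ["GGCCATGAAA 10", "CCTTAAGGCC 20"]
    sequence = join_rows(toy_rows)
    print(extract_coding(sequence, 5, 16))  # prints ATGAAACCTTAA
```

The sketch does not try to repair characters misread during scanning. A row whose counter was itself garbled into letters would need manual correction before joining.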



SEQ ID NO:123 PFJ1 DNA SEQUENCE

Nucleic Acid Accession #: NM_001844

Coding sequence: 158-4621 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
| | | | | |

ACGCAGAGCG CTGCTGGGCT GCCGGGTCTC CCGCTTCCTC CTCCTGCTCC AAGGGCCTOC 60 
TGCATGAGGG CGCGGTAGAG ACCCGGACCC GCGCCGTGCT OCTGCOGTTT CGCTGCGCTC 120 
CGCCCGGGCC CGGCTCAGCC AGGCCCCGCG GTGAGCCAJS ATTCGCCTCG GGGCTCCCCA 180 
GTCGCTGGTG CTGCTGAOGC TGCTCGTCGC CGCTGTCCTT CGGTGTCAGG GOCAGGATCT 240 
CCAGGAGGCT GGCAGCTGTG TGCAGGATGG GCAGAGGTAT AATGATAAGG ATGTGTGGAA 300 
GGCGGAGOCX TGCCGGATCT GTGTCTGTG A CACTGGGACT GTCCTCTGOG ACG ACATAAT 360 
CTGTGAAGAC GTGAAAGACT GCCTCAGCCX TGAGATCCCC TTCGGAGAGT GCTCOCOCAT 420 
CTGOXAACT GACCTCGCCA CTGCCAGTGG GCAACCAGGA CCAAAGGGAC AGAAAGGAGA 480 
ACXTKK3 AG AC ATCAAGG ATA TTGTAGGACC CAAAGGACCT CCTGGGCCTC AGGGAOCTGC 540 
AGGGGAACAA GGACCCAGAG GGGATCGTGG TGACAAAGGT GAAAAAGGTG CCCCTGGACC 600 
TCGTGGCAGA GATGGAGAAC CTGGGACCCC TGGAAATCCT GGCCCCCCTG GTCCTCCCGG 660 
CCCCCCTGGT CCCCCTGGTC TTGGTGGAAA CTTTGCTGCC CAGATGGCTG GAGGATTTGA 720 
TGAAAAGGCT GGTGGCGCCC AGTTGGGAGT AATGCAAGGA CCAATGGGCC CCATGGGACC 780 
TCGAGGACCT CCAGGCCCTG CAGGTGCTCC TGGGCCTCAA GGATTTCAAG GCAATCCTGG 840 
TGAACCTGGT OAACCTGGTG TCTCTGGTCC CATGGGTCCC CGTGGTCCTC CTGGTCCCCC 900 
TGGAAAGCCT GGTGATGATG GTGAAGCTGG AAAACCTGGA AAAGCTGGTG AAAGQGGTCC 960 
GCCTGGTCCT CAGGGTGCTC GTGGTTTCCC AGGAACCCCA GGCCTTCCTG GTGTCAAAGG 1020 
TCACAGAGGT TATCCAGGCC TGGACGGTGC TAAGGGAGAG GCGGGTGCTC CTGGTGTGAA 1080 
GGGTGAGAGT GGTTCCCCGG GTGAGAACGG ATCTCCGGGC CCAATGGGTC CTCGTGGOCT 1 140 
GCCTGGTGAA AGAGGACGGA CTGGCCCTGC TGGCGCTGCG GGTGCCCGAG GCAACGATGG 1200 
TCAGCCAGGC CCCGCAGGTC CTCCGGGTCC TGTCGGTCCT GCTGGTGGTC CTGGCTTCCC 1260 
TGGTGCTCCT GGAGCCAAGG GTGAAGCCGG CCCCACTGGT GCCCGTGGTC CTGAAGGTGC 1320 
TCAAGGTCCT CGCGGTGAAC CTGGTACTCC TGGGTCCCCT GGGCCTGCTG GTGCCTCCGG 1380 
TAACCCTGGA ACAGATGGAA TTCCTGGAGC CAAAGGATCT GCTGGTGCTC CTGGCATTGC 1440 
TGGTGCTCCT GGCTTCCCTG GGCCACGGGG TCCTCCTGGC CCTCAAGGTG CAACTGGTCC 1500 
TCTGGGCCCG AAAGGTCAGA CGGGTGAACC TGGTATTGCT GGCTTCAAAG GTGAAC AAGG 1560 
CCCCAAGGGA GAACCTGGCC CTGCTGGCCC CCAGGGAGCC CCTGGACCCG CTGGTGAAGA 1620 
AGGCAAGAGA GGTGCCCGTG GAGAGCCTGG TGGCGTTGGG CCCATCGGTC CCCCTGGAGA 1680 
AAGAGGTGCT CCCGGAAACC GCGGTTTOCC AGGTCAAGAT GGTCTGGCAG GTCCCAAGGG 1740 
AGCCCCTGGA GAGCGAGGGC CCAGTGGTCT TGCTGGCCCC AAGGGAGCCA ACGGTGACCC 1800 
TGGCCGTCCT GGAGAACCTG GCCTTCCTGG AGCCCGGGGT CTCACTGGCC GCCCTGGTGA 1860 
TGCTGGTCCT CAAGGCAAAG TTGGCCCTTC TGGAGCCCCT GGTGAAGATG GTCGTCCTGG 1920 
ACCTCCAGGT CCTCAGGGGG CTCGTGGGCA GCCTGGTGTC ATGGGTTTCC CTGGCCCCAA 1980 
AGGTGCCAAC GGTGAGCCTG GCAAAGCTGG TGAGAAGGGA CTGCCTGGTG CTCCTGGTCT 2040 
GAGGGGTCTT CCTGGCAAAG ATGGTGAGAC AGGTGCTGCA GGACCCCCTG GCCCTGCTGG 2100 
ACCTGCTGGT GAACGAGGCG AGCAGGGTGC TCCTGGGCCA TCTGGGTTCC AGGGACTTCC 2160 
TGGCCCTCCT GGTCCCCCAG GTGAAGGTGG AAAACCAGGT GACCAGGGTG TTCCCGGTGA 2220 
AGCTGGAGCC CCTGGCCTCG TGGGTCCCAG GGGTGAACGA GGTTTCCC AG GTG AACGTGG 2280 
CTCTCCCGGT GCCCAGGGCC TCCAGGGTCC CCGTGGCCTC CCCGGCACTC CTGGCACTGA 2340 
TGGTCCCAAA GGTGCATCTG GCCCAGCAGG CCCCCCTGGC GCACAGGGCC CTCCAGGTCT 2400 
TCAGGGAATG CCTGGCGAGA GGGGAGCAGC TGGTATCGCT GGGCCCAAAG GCGACAGGGG 2460 
TGACGTTGGT GAGAAAGGCC CTGAGGGAGC CCCTGGAAAG GATGGTGGAC GAGGCCTGAC 2520 
AGGTCCCATT GGCCCCCCTG GCCCAGCTGG TGCTAACGGC GAGAAGGGAG AAGTTGGACC 2580 
TCCTGGTCCT GCAGGAAGTG CTGGTGCTCG TGGCGCTCCG GGTGAACGTG GAGAGACTGG 2640 
CCCCCCCGGA CCAGCGGGAT TTGCTGGGCC TCCTGGTGCT GATGGCCAGC CTGGGGCCAA 2700 
GGGTGAGCAA GGAG AGGCCG GCCAGAAAGG CGATGCTGGT GCCCCTGGTC CTCAGGGCCC 2760 
CTCTGGAGCA CCTGGGCCTC AGGGTCCTAC TGGAGTGACT GGTCCTAAAG GAGCCCGAGG 2820 
TGCCCAAGGC CCCCCGGGAG CCACTGGATT CCCTGGAGCT GCTGGCCGCG TTGG ACCCCC 2880 
AGGCTCCAAT GGCAACCCTG GACCCCCTGG TCCCCCTGGT CCITCTGGAA AAGATGGTCC 2940 
CAAAGGTGCT CGAGGAGACA GCGGCCCCCC TGGCCGAGCT GGTGAACCCG GCCTCCAAGG 3000 
TCCTGCTGGA CCCCCTGGCG AGAAGGGAGA GCCTGGAGAT GACGGTCCCT CTGGTGCCGA 3060 
AGGTCCACCA GGTCCCCAGG GTCTGGCTGG TCAGAGAGGC ATCGTCGGTC TGCCTGGGCA 3120 
AOGTGGTGAG AGAGGATTCC CTGGCTTGCC TGGCCCATCG GGTGAGCCCG GCAAGCAGGG 3180 
TGCTCCTGGA GCATCTGGAG ACAGAGGTCC TCCTGGCCCC GTGGGTCCTC CTGGCCTG AC 3240 
GGGTCCTGCA GGTGAACCCG GACGAGAGGG AAGCCCCGGT GCTGATGGCC CCCCTGGCAG 3300 
AGATGGCGCT GCTGGAGTCA AGGGTGATCG TGGTGAGACT GGTGCTGTGG GAGCTCCTGG 3360 
AGCCCCTGGG CCCCCTGGCT CCCCTGGCCC CGCTGGTCCA ACTGGCAAGC AAGGAGACAG 3420 
AGG AG AAGCT GGTGC ACAAG GCCCC ATGGG ACCCTCAGGA CCAGCTGGAG CCCGGGG AAT 3480 
CCAGGGTCCT CAAGGOCCCA GAGGTGACAA AGGAGAGGCT GGAGAGCCTG GCGAGAGAGG 3540 
CCTGAAGGGA CACCGTGGCT TCACTGGTCT GCAGGGTCTG CCCGGCCCTC CTGGTCCTTC 3600 
TGGAGACCAA GGTGC1 1C1G GTCCTGCTGG TCCTTCTGGC CCTAGAGGTC CTCCTGGCCC 3660 
CGTCGGTCCC TCTGGCAAAG ATGGTGCTAA TGG AATCCCT GGCCCCATTG GGCCTCCTGG 3720 
TCCCCGTGGA CGATCAGGCG AAACCGGTCC TGCTGGTCCT CCTGGAAATC CTGGGCCCCC 3780 
TGGTCCTCCA GGTCCCCCTG GCCCTGGCAT CGACATGTCC GCCTTTGCTG GCTTAGGCCC 3840 
GAGAGAGAAG GGCCCCGACC CCCTGCAGTA CATGCGGGCC GACCAGGCAG CCGGTGGCCT 3900 
GAGACAGCAT GACGCCGAGG TGGATGCCAC ACTCAAGTCC CTCAACAACC AGATTGAGAG 3960 
CATCCGCAGC CCCGAGGGCT CCCGCAAGAA CCCTGCTCGC ACCTGCAGAG ACCTGAAACT 4020 
CTGCCACCCT GAGTGGAAGA GTGGAGACTA CTGGATTGAC CCCAACCAAG GCTGCACCTT 4080 
GGACGCCATG AAGGTTTTCT GCAACATGGA GACTGGCGAG ACTTGCGTCT ACCCCAATCC 4140 
AGCAAACGTT CCCAAGAAG A ACTGGTGGAG CAGCAAGAGC AAGGAGAAGA AACACATCTG 4200 
GTTTGGAGAA ACCATCAATG GTGGCTTCCA TTTCAGCTAT GGAGATGACA ATCTGGCTCC 4260 
CAACACTGCC AACGTCCAGA TGACCTTCCT ACGCCTGCTG TCCACGGAAG GCTCCCAGAA 4320 
CATCACCTAC CACTGCAAGA ACAGCATTGC CTATCTGGAC GAAGCAGCTG GCAACCTCAA 4380 
GAAGGCCCTG CTCATCCAGG GCTCCAATGA CGTGGAG ATC CGGGCAGAGG GCAATAGCAG 4440 
GTTCACGTAC ACTGCCCTGA AGGATGGCTG CACGAAACAT ACCGGTAAGT GGGGCAAG AC 4500 
TGTTATCGAG TACCGGTCAC AGAAGACCTC ACGCCTCCCC ATCATTGACA TTGCACCCAT 4560 
GGACATAGGA GGGCCCGAGC AGGAATTCGG TGTGGACATA GGGCCGGTCT GCTTCTTGTA 4620 
AAAACCTGAA CCCAGAAACA ACACAATCCG TTGCAAACCC AAAGGACCCA AGTACTTTCC 4680 
AATCTCAGTC ACTCTAGGAC TCTGCACTGA ATGGCTG ACC TGACCTGATG TCCATTCATC 4740 
CCACCCTCTC ACAGTTCGGA CTTTTCTCCC CTCTCTTTCT AAGAGACCTG AACTGGGCAG 4800 
ACTGCAAAAT AAAATCTCGG TGTTCTATTT ATTTATTGTC TTCCTGTAAG ACCTTCGGGT 4860 
CAAGGCAGAG GCAGGAAACT AACTGGTGTG AGTCAAATGC CCCCTGAGTG ACTGCCCCCA 4920 
GCCCAGGCCA GAAGACCTCC CTTCAGGTGC CGGGCGCAGG AACTGTGTGT GTCCTACACA 4980 
ATGGTGCTAT TCTGTGTCAA ACACCTCTGT ATTTTTTAAA ACATCAATTG ATATTAAAAA 5040 
TGAAAAGATT ATTGGAAAGT 



SEQ ID NO:124 PFJ1 Protein sequence:
Protein Accession #: NP_0018&2

1 11 21 31 41 51
| | | | | |

MKLGAPQSL VLLTLLVAA V LRCQGQDVQE AGSCVQDGQR YNDKDVWKPE PCRICVCDTG 60 
TVLCDDHCE DVKDCLSPEI PFGECCPICP TDLATASGQP GPKGQKGEPG DIKDIVGPKG 120 
PPGPQGPAGB QGPRGDRGDK OEKGAPGPRG RDGEPGTPGN PGPPGPPGPP GPPGLGGNFA 180 
AQMAGGFDEK AGGAQLGVMQ GPMGPMGPRG PPGPAGAPGP QGFQGNPGEP GEPGVSGPMG 240 
PRGPPGPPGK PGDDGEAGKP GKAGERGPPG PQG ARGFPGT PGLPGVKGHR GYPGLDGAKG 300 
EAGAPGVKGE SGSPGENGSP GPMGPRGLPG ERGRTQPAGA AGARGNDGQP GPAGPPGPVG 360 
PAGGPGFPGA PGAKGEAGPT GARGPEGAQG PRGEPGTPGS PGPAGASGNP GTDGIPGAXG 420 
SAGAPGIAGA PGFPGTRGPP GPQGATGPLG PKGQTGEPGI AGFKGEQGPK GEPGPAGPQG 480 
APGPAGEEGK RGARGEPGGV GPIGPPGERG APGNRGFPGQ DGLAGPKGAP GERGPSGLAG 540 
PKGANGDPGR PGEPGLPGAR GLTGRPGDAG PQGKVGPSGA PGEDGRPGPP GPQGARGQPG 600 
VMGFPGPKGA NGEPGKAGEK GLPGAPGLRG LPGKDGETGA AGPPOPAGPA GERGEQGAPG 660 
PSGFQGLPGP PGPPGEGGKP GDQGVPGEAG APGLVGPRGE RGFPGERGSP GAQGLQGPRG 720 
LPGTPGTDGP KGASGPAGPP GAQGPPGLQG MPGERGAAGI AGPKGDRGDV GEKGPEGAPG 780 
KDGGRGLTGP IGPPGPAGAN GEKGEVGPPG PAGS AG ARGA PGERGETGPP GPAGFAGPPG 840 
ADGQPGAKGE QGEAGQKGDA G APGPQGPSG APGPQGPTGV TGPKGARGAQ GPPGATGFPG 900 
AAGRVGPPGS NGNPGPPGPP GPSGKDGPKG ARGDSGPPGR AGEPGLQGPA GPPGEKGEPG 960 
DDGPSGAEGP PGPQGLAGQR GJVGLPGQRG ERGEPGLPGP SGEPGKQGAP GASGDRGPPG 1020 
PVGPPGLTGP AGEPGREGSP GADGPPGRDG AAGVKGDRGE TGAVGAPGAP GPPGSPGPAG 1080 
PTGKQGDRGE AGAQGPMGPS GPAGARGJQG PQGPRGDKGE AGEPGERGLK GHRGFTGLQG 1140 
UGPPGPSGD QGASGPAGPS GPRGPPGPVG PSGKDGANGI PGPIGPPGPR GRSGETGPAG 1200 
PPGNPGPPGP PGPPGPGIDM S AFAGLGPRE KGPDPLQYMR ADQAAGGLRQ HDAEVDATLK 1260 
SLNNQEESIR SPEGSRKNPA RTCRDLKLCH PEWKSGDYWI DPNQGCTLDA MKVFCNMETG 1320 
ETCVYPNPAN VPKKNWWSSK SKEKKHIWFG ETINGGFHFS YGDDNLAPNT ANVQMTFLRL 1380 
LSTEGSQNIT YHCKNSIAYL DEAAGNLKKA LLIQGSNDVE IRAEGNSRFT YTALKDGCTK 1440 
HTGKWGKTVI EYRSQKTSRL PDDIAPMD1 GGPEQEFGVD IGPVCFL 



SEQ ID NO:125 PFH9 DNA SEQUENCE

Nucleic Acid Accession #: NM_005084

Coding sequence: 162-1487 (underlined sequences correspond to start and stop codons)
1 11 21 31 41 51

GCTGGTCGGA GGCTCGCAGT GCTGTCGGCG AGAAGCAGTC GGGTTTGGAG CGCTTGGGTC 60 
GOGTTGGTGC GCGGTGGAAC GCGOCCAGGG AOCCCAGTTC OOGCGAGCAG CTOCGCOCCC 120 
CGCCTGAGAG ACTAAGCTGA AACTGCTGCT CAGCTCCCAA GATGGTGCCA CCCAAATTGC 180 
ATGTGCmT CTGCCTCTGC GGCTGCCTGG CTGTGGTTTA TCCTTTTGAC TGGCAATACA 240 
TAAATCCTGT TGCCCATATG AAATCATCAG CATGGGTCAA CAAAATACAA GTACTGATGG 300 
CTGCTGCAAG CTTTGGCCAA ACTAAAATCC CCCGGGG AAA TGGGCCTTAT TGCGTTGGTT 360 
GTACAGACTT AATGTTTGAT CACACTAATA AGGGCACCTT CTTGCGTTTA TATTATOCAT 420 
OCCAAG ATAA TGATCGCCTT GACACCCTTT GGATCGCAAA TAAAG AATAT TnTGGGGTC 480 
TTAGCAAATT TCTTGGAACA CACTGGCTTA TGGGCAACAT TTTG AGGTTA CTCTTTGGTT 540 
CAATG ACAAC TCCTGCAAAC TGGAATTCCC CTCTGAGGCC TGGTGAAAAA TATOCACTTG 600 
TTGTTTTTTC TCATGGTCTT GGGGCATTCA GGACACTTTA TTCTGCTATT GGCATIGACC 660 
TGGCATCTCA TGGGTTTATA GTTGCTGCTG TAGAACACAG AGATAGATCT GCATCTGCAA 720 
CTTACTATTT CAAGG ACCAA TCTGCTGCAG AAATAGGGGA CAAGTCTTGG CTCTACCTTA 780 
GAACCCTGAA ACAAG AGGAG GAG ACACATA TACGAAATGA GCAGGTACGG CAAAGAGCAA 840 
AAGAATGTTC CCAAGCTCTC AGTCTGATTC TTGACATTGA TCATGGAAAG CCAGTGAAGA 900 
ATGCATTAGA TTTAAAGTTT GATATGGAAC AACTGAAGGA CTCTATTGAT AGGGAAAAAA 960 
TAGCAGTAAT TCGACATTCT TTTGGTGGAG CAACGGTTAT TCAGACTCTT AGTGAAGATC 1020 
AGAGATTCAG ATGTGGTATT GCCCTGGATG CATGGATGTT TCCACTGGGT GATGAAGTAT 1080 
ATTCCAGAAT TCCTCAGCCC CTCTTTTTTA TCAACTCTOA ATATTTCCAA TATCCTGCTA 1140 
ATATCATAAA AATGAAAAAA TGCTACTCAC CT GATAA AGA AAGAAAGATG ATTACAATCA 1200 
GGGGTTCAGT CCACCAGAAT TTTGCTGACT TCACTTTTGC AACTGGCAAA ATAATTGGAC 1260 
ACATGCTCAA ATTAAAGGGA GACATAGATT CAAATGTAGC TATTGATCTT AGCAACAAAG 1320 
CTTCATTAGC ATTCTTACAA AAGCATTTAG GACTTCATAA AGATTTTGAT CAGTGGGACT 1380 
GCTTGATTGA AGGAGATGAT GAGAATCTTA TTCCAGGGAC CAACATTAAC ACAACCAATC 1440 
AACAC ATCAT GTTACAGAAC TCTTCAGGAA TAGAGAAATA CAATJAGGAT TAAAATAGGT 1500 
TTTTT 



SEQ ID NO:126 PFH9 Protein sequence

1 11 21 31 41 51

MVPPKLH VLF CLCGCLAVVY PFDWQYINPV AHMKSS A WVN KIQVLMAAAS FGQTKIPRGN 60 
GPYSVGCTDL MFDHTNKGTF LRLYYFSQDN DRLDTLWIPN KEYFWGLSKF LGTHWLMGNI 120 
LR1XFGSMTT PANWNSPLRPGEKYPLWFS HGLGAFRTLY SAIG1DLASH GFTVAAVEHR 180 
DRSASATYYF KDQSAAEJGD KSWLYLRTLK QEEETWRNE QVRQRAKEGS QALSLXLDID 240 
HGKPVKNALD LKFDMEQLKD SIDREKIAVI GHSFGGATVI QTLSEDQRFR CGIALDAWMF 300 
PLGDEVYSRI PQPLFFINSE YFQYPANUK MKKCYSPDKE RKMITIRGSV HQNFADFTFA 360 
TGKUGHMLK LKGDIDSNVA IDLSNKASLA FLQKHLGLHK DFDQWDCUE GDDENLIPGT 420 
NINTINQHIM LQNSSGIEKY N 



SEQ ID NO:127 PFH8 DNA SEQUENCE

Nucleic Acid Accession #: NM_015900

Coding sequence: 32-1402 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
| | | | | |

CACGAGCGGC ACOAGGATTT CCAGCTCAGC GAJ£CCCCCA GGTCCCTGGG AOAGCTGCTT 60 
CTGGGTGGGG GGCCTCATTT TGTGGCTCAG CGTTGGAAGT TCAGGGGATG CACCTCCTAC 120 
CCCACAGCCA AAGTGCGCTG ACTTCCAGAG CGCCAACCTT TTTGAAGGCA CCGATCTCAA 180 
AGTCCAGTTT CTCCTCTITG TCCCTTCGAA TCCTAGCTGT GGGCAGCTAG TAGAAGGAAG 240 
CAGTGACCTC CAAAACTCTG GGTTCAATGC CACTCTGGGA ACCAAACTAA 1TATCCATGG 300 
ATTCAGGGTT TTAGGAACAA AGCCTTCCTG GATTGACACA TTTATTAGAA CCCTTCTGOG 360 
TGCAACGAAT GCTAATGTGA TTGCCGTGGA CTGGATTTAT GGGTCTACAG GAGTCTACTT 420 
CTCAGCTGTG AAAAATGTG A TT AAGTTG AG CCTCG AG ATC TCCCTTTTCC TCAATAAACT 480 
CCTGGTGCTG GGTGTGTCGG AATCCTCAAT CCACATCATT GGTGTTAGCC TGGGGGCCCA 540 
CGTTGGGGGC ATGGTGGGAC AGCTCTTCGG AGGOCAGCTG GGACAGATCA CAGGCCTGGA 600 
CCCCGCTGGA CCTG AGTACA CCAGGGCCAG TGTGGAAGAG CGCTTGGATG CTGGAG ATGC 660 
CCTCTTCGTG GAAGCCATCC ACACAGACAC CGACAATTTG GGTATTCGGA TTCCCGTTGG 720 
ACATGTGGAC TACTTCGTCA ACGGAGGCCA AGACCAACCT GGCTGCCCCA CCTTCTTTTA 780 
CGCAGGTTAT AGTTATCTGA TCTGTOATCA CATGAGGGCT GTGCACCTCT ACATCAGCGC 840 
CCTGGAGAAT TCCTGTCCAC TGATGGCCTT TCCCTGTGCC AGCTACAAGG OCTTCCITGC 900 
TGGACGCTGT CTGGATTGCT TTAACCCTTT TCTGCTTTCC TGCCCAAGGA TAGGACTGGT 960 
GGAACAAGGT GGTGTCAAGA TAGAGCCGCT CCCCAAGGAA GTGAAAGTCT AOCTCCTGAC 1020 
TACTTCCAGT GCTCCGTACT GCATGCATCA CAGCCTCGTG GAGTTTCACT TG AAGG AACT 1080 
GAGAAACAAG GACACCAACA TCGAGGTTAC CTTCCTTAGC AGTAACATCA CCTCTTCATC 1140 
TAAGATCACC ATAOCTAAGC AGCAACGCTA TGGGAAAGOA ATCATAGCOC ATGOCACOCC 1200 
ACAATGCCAG ATAAACCAAG TGAAATTCAA GTTTCAGTCT TCCAACCGAG TTTGGAAAAA 1260 
AGACCGGACT ACCATTATTG GGAAGTTCTG CACTGCCCTT TTGCCTGTCA ATGACAGAGA 1320 
AAAGATGGTC TGCTTACCTG AACCAGTGAA CTTACAAGCA AGTGTGACTG TTTCCTGTGA 1380 
OCTGAAGATA GCCTGTGT GT AGTTTAACCT GGGCAGGACA CATCTCCCTG CATriTITlT 1440 

inn run gagagagagg tgtc atgagg gatgtgtgtg tgcagcttat tgtagaccat 1500 

TACTACTAAG GAGAAAAGCA AAGCTCTTTC TTATTTTCCT CATAATCAGC TACCCTGGAG 1560 
GGGAGGGAGA ACTCAnTTA CAGAACTTGG TTTCCTTTGC CGATCTTATG TACATACCCA 1620 
TTTTAGCTTT CCCATGCATA CTTAACTGCA CTTGCTTTAT CTCCTIGGGC ATTCGTACTT 1680 
AGG ATTCAAT AG AAACATGT ACAGGGTAAA CAATTTTTTA AAAATAAAAC TTCATGGAGT 1740 
AAAAAAAAAA AAAAAAAA 



SEQ ID NO:128 PFH8 Protein sequence
Protein Accession #: NP_056984.1

1 11 21 31 41 51
| | | | | |

MPPGPWESCF WVGGULWLS VGSSGDAPPT PQPKCADPQS ANLFEGTDLK VQFLLFVPSN 60 
PSCGQLVEGS SDLQNSGFNA TLGTKLUHG FRVLGTKPSW IDTFIRTLLR ATNANVIAVD 120 
WIYGSTGVYF S A VKNVIKLS LEISLFLNKL LVLGVSESSI HIIGVSLGAH VGGMVGQLFG 180 
GQLGQITGLD PAGPEYTRAS VEERLDAGDA LFVEAIHTDT DNLGHUPVG HVDYFVNGGQ 240 
DQPGCPTFFY AGYSYliCDH MRAVHLYISA LENSCPLMAF PCAS YKAFLA GRCLDCFNPF 300 
LLSCPRIGLV EQGGVKEPL PKEVKVYLLT TSSAPYCMHH SLVEFHLKEL RNKDTNIEVT 360 
FLSSNTTSSS KTTIPKQQR Y GKG EAHATP QCQINQVKFK FQSSNR VWKK DRTT1IGKFC 420 
TALLPVNDRE KMVCLPEPVN LQASVTVSCD LKIACV 



SEQ ID NO:129 PFH7 DNA SEQUENCE

Nucleic Acid Accession #: NM_014384

Coding sequence: 89-1336 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
| | | | | |

CGTTGCCGGG TCGCAGGTCC CGCCAGTGCG AGCGCAACGG AGGTCGAAGG CGTTCAGACT 60 
CTTAGCTGAA CGCGGAGCTG CCGCGGC TAT G CTGTGGAGC GGCTGCCGGC GTTTCGGGGC 120 
GCGCCTCGGC TGCCTGCCCG GCGGTCTCCG GGTCCTCGTC CAGACCGGCC ACCGGAGCTT 180 
GACXTTCCTGC ATCGACCCTT CCATGGGACT TAATGAAGAG CAGAAAGAAT TTCAAAAAGT 240 
GGCCTTTGAC TTTGCTGCCC GAGAGATGGC TCCAAATATG GCAGAGTGGG ACCAGAAGGA 300 
GCTGTTCCCA GTGGATGTGA TGCGGAAGGC AGCCCAGCTA GGCTTCGGAG GGGTCTACAT 360 
ACAAACAGAT GTGGGCGGGT CTGGGCTGTC ACGTCTTGAT ACCTCTGTCA TTTTTGAAGC 420 
CTTGGCTACA GGCTGCACCA GCACCACAGC CTATATAAGC ATOCACAACA 7GTGTGCCTG 480 
GATGATTGAT AGCTTCGGAA ATGAGGAACA GAGGCACAAA TTTTGCCCAC CGCTCTGTAC 540 
CATGOAGAAG TTTGCTTCCT ACTGOCTCAC TGAACCAGOA AGTGGGAGTG ATGCTGCCTC 600 
TCTTCTGACC TCCGCTAAGA AACAGGGAGA TCATTACATC CTCAATGGCT CCAAGGCCTT 660 
CATCAGTGGT GCTGGTGAGT CAGACATCTA TGTGGTCATG TGCCGAACAG GAGGACCAGG 720 
CCCCAAGGGC ATCTCATGCA TAGTTGTTGA GAAGGGGACC (XTGGCCTCA GCTTTGGCAA 780 
GAAGGAG AAA AAGGTGGGGT GGAACTCCCA GCCAACACGA GCIGTGATCT TCG AAGACTG 840 
TGCTGTCCCT GTGGCCAACA G AATTGGG AG CG AGGGGCAG GGCTTCCTCA TTGCCGTGAG 900 
AGGACTGAAC GG AG GO AGO A TCAATATTGC TTCCTGCTCC CTGGGGGCTG CCCACGCCTC 960 
TGTCATCCTC ACOCGAGACC ACCTCAATGT CCGGAAGCAG TTTGGAGAGC CTCTGGCCAG 1020 
TAACCAGTAC TTGCAATTCA CACTGGCTGA TATGGCAACA AGGCTGGTGG CCGCGCGGCT 1080 
GATGGTCCGC AATGCAGCAG TGGCTCTGCA GOAGGAOAGG AAGGATGCAG TGGCCTIGTG 1140 
CTCCATGGCC AAGCTCTTTG CTACAGATGA ATGCTTTGCC ATCTGCAACC AGGCCTTGCA 1200 
GATGCACGGG GGCTACGGCT AOCTGAAGGA TTACGCTGTT CAGCAGTACG TGCGGGACTC 1260 
CAGGGTCCAC CAGATTCTAG AAGGTAGCAA TOAAGTGATG AGGATACTGA TCTCTAGAAG 1320 
CCTGCT7CAG OAGTAGAACC CACACTTGTT CTGGCCTGGT GTTCAGTGCG ACTGCAGTCA 1380 
GTGTTGAGTG GTGCCATGTG GGCCGCTCTA TTCCAAAGGA ATCATGGATT AGACCCAAGG 1440 
GCTG AGCTOC TCTAGGGCAG GACCTGCACC CTGTGTGTTG GCACCAGCAT CGGGTCTTGG 1500 
ACTGGGGCAG AATCCCCAGT GGAACCGGAA GAGCTGGACTGATGAGAAAC ATCAGAAGAA 1560 
CACATACTAC CTTCTTTTCC TAATGCCAGA AGGGTGACCA GTGAAGATTC ACCGTCAAAC 1620 
CATGAAAGTC CnTCTTGGA TCCACTTTAT CTTGATTAGT CTGCATTTTA CTAGTTCACT 1680 
GGATCCCTCC TCTAGGGGCC TGGGG ACTTT CACTG ATGCT CTTCCTGATT CTAG AGCAAA 1740 
GGTGTGGGAA GGGGAAATGG AGGAATGCCC TCCTGTCTGT GTCGTTCTCT GTGCCACAGC 1800 
TACAGATGCA GAAGGTTTCT CTGGATAGCA CACCTCTGAA TGTAAATCAT GATAAAATGG 1860 
ATATTTGGAA ACTTACTCCT AAGCTGTG AT GTAGGGTGTA TTTCTACTTC TGGACTGCCT 1920 
CAATATCAAG GGCTGAGACT TTTGAATGTT GAATATTCGT TGGGTTTCAT GTTAAGACGC 1980 
CTG TGGTCC A GGA GTCCT AT TCAGTGTTTC TGTTCCTGAT AAACACTTTG AATATTTTTT 2040 
TGTGTTTTTG TTTCCTTTTC TG AAGCTGTT CCTCCTTTTA AATATTTTTA ATCACATTGA 2100 
TAAAATCTAT CCTTCATCCA OCTCTGGTTC TACTATAGTT GATTTTTATT TTAAATGTTT 2160 
AATTGTATTT G ATTAAACAC TTAACTGGAT TTTGG AAT AA TAAAACTCTC GTCCAATTTG 2220 
GCTTTT AAAA AAAAAAAA 



SEQ ID NO:130 PFH7 Protein sequence:
Protein Accession #: NP_055199.1

1 11 21 31 41 51
| | | | | |

MLWSGCRRFG ARLGCLPGGL RVLVQTGHRS LTSODPSMG LNEEQ KEFQK VAFDFAAREM 60 
APNMAEWDQK ELFPVDVMRK AAQLGFGGVY IOTDVGGSGL SRLDTSVIFE ALATGCTSTT 120 
AYKMNMCA WMIDSPGNEE QRHKFCPPLC TMEKFASYCL TEPGSGSDAA SLLTSAKKQG 180 
DHYILNGS KA HSG AGESDI YWMCRTGGP GPKGISCTW EKGTPGLSFG KKEKKVGWNS 240 
QFTRAVIFED CAVPVANR2G SEGQGFIiA V RGLNGGRINI ASCSLGAAHA SV2LTRDHLN 300 
VRKQFGEPLA SNQYLQFTLA DMATRLVAAR LMVRNAAVAL QEERKDAVAL CSMAKLFATD 360 
ECFAICNQAL QMHGG YGYLK DYA VQQYVRD SRVHQILEG5 NBVMRHJSR SLLQE 



SEQ ID NO:131 PFH6 DNA SEQUENCE

Nucleic Acid Accession #: NM_013989

Coding sequence: 707-1105 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
| | | | | |

GCCTGCAGAG AGAGGCACTT TGCACCACAG ACAGATAGCA AGAAGGGAAA GACAGAGAGT 60 
G AGAAAAAAG AGG AGTCAGT CGCTCCTGGG G AAGGG AGAG AGTGAGACTG GGAGAAAGAG 120 
AAGCACAGAA AGTGTGTGTA AAACGGAGTA AAGAAAGAAA AAAAAAAAAC TACCCTTAAA 180 
GCACATTTAA AAAAAAAAAA CTCTGGCAAT TCAAGAAAGA AACAGGCTAC GTTTAAAG AG 240 
CATAGAG ACA ATGAAAGGCT AAAGAAAATT TTAAAATCTC TGCCACAGTC TCATAGGTGC 300 
TTGGAAATGA AAGTAGAACT GCCTGTCTTT AACGGACTCT GACAGAGGTA ACTGGATTAG 360 
GGACGAGTAC GCCAGCTTTT liilHlllI 1 1 11 1 111 1 1 TTTAACATCT TAAATCCTGA 420 
AAAAAAAAAA AAAAAAAAAA AAAAGGCAGC AGCTCCGAAT TGAATG AATT GATGGGCACA 480 
CTCCAACTGC TGGGCTGGAG AGACTGGACT TAGTCTTGCC ATTTCTGCTT CTTTGAAAGA 540 
GGAGACAACT TGGG C T TC CT T T TAATTTAG 1111111 1<J C CCTTCTCCCC CAACCCCCAA 600 
CCTTOCCCCT TACCTCCCCC ACCCCCnTA TCACCACCCC CCmTA AAT AAGAGGGTGA 660 
AGGGGAACCA GAGCGCACAA GGGAACTGAC TCAGGAGGCA GAGAAGAXQG GCATCCTCAG 720 
OGTAGACTTG CTGATCACAC TGCAAATTCT GCCAGTTTTT TTCTCCAACT GCCTCTrcCT 780 
GGCTCTCTAT G ACTCGGTCA TTCTGCTCAA GCACGTGGTG CTGCTGTTGA GCCGCTCCAA 840 
GTCCACTCGC GGAGAGTGGC GGCGCATGCT GACCTCAGAG GGACTGCGCT GCGTCTGGAA 900 
GAGCTTCCTC CTCGATGCCT ACAAACAGGT GAAATTGGGT GAGGATGCCC CCAATTCCAG 960 
TGTGGTGCAT GTCTCCAGTA CAG AAGGAGG TG ACAACAGT GGCAATGGTA CCGAGGAGAA 1020 
GATAGC TGAG GGAGCCACAT GCCACCTTCT TGACTTTGCC AGCCCTGAGC GCCCACTAGT 1080 
GGTCAACTTT GGCTCAGCCA CTJjQACCTCC TTTCACG AGC CAGCTGCCAG CCTTOCGCAA 1140 
ACTGGTGGAA GAGTTCTCCT CAGTGGCTGA CTTCCTGCTG GTCTACATTG ATGAGGCTCA 1200 
TCCATCAGAT GGCTGGGCGA TACCGGGGGA CTCCTCTTTG TCTTTTGAGG TGAAG AAGCA 1260 
CCAGAACCAG GAAGATCGAT GTGCAGCAGC CCAGCAGCTT CTGGAGCGTT TCTCCTTGCC 1320 
GCC CCAG TGC CGAGTTGTGG CTGACCGCAT GGACAATAAC GCCAACATAG CTTACGGGGT 1380 
AGCCTTTGAA CGTGTGTGCA TTGTGCAGAG ACAGAAAATT GCTTATCTGG GAGGAAAGGG 1440 
CCCCTTCTCC TACAAOCTTC AAGAAGTCGG GCATTGGCTG GAGAAGAATT TCAGCAAG AG 1500 
ATGAAAGAAA ACTAGATTAG CTGGTTAAAG GTATGATTAT AAGAGAGCTT ATTGTTTTAA 1560 
AAAGTTATAT AAAGGCAAGG AAATTAAGAA CTG AATCCAT ATTTCAACAG AGCCCTATTG 1620 
GCTTACTGAA AGACAGGAGT TTATCTATCG GAAGAACATG AATCTCTAAC AGCTCCATAC 1680 
TTCTTTCACT ACTCAAATGO CATTGGGCTG AGTAAGTAAC CATATCACCT CTCTTCTTAG 1740 
TAAAAAGCCC TATGTGAAAA GATCCCAAG A TGGAGAGGAA GAAACGCTAA TTCAGCATGT 1800 
GTTCATTCTG CATTGAGAAG GAACTGATAC ATCTGATGCA TGCTTTGAGA CCAGAAGAAA 1860 
AG ACTTACCT G AAT AATT AC T AC ATTAGGG AAGCT ACTGT CTACGTTAAG AT AAAGGGTA 1920 
TTGCCTTGGC TCTATTTGGC ATCGATGGAG CCCAGTTOOA AAATTCCCAA ATATTACAAC 1980
AAGTCCTTGA ACCCAGGCCA TGTOOTTAOA CGTTGGTGTT AAGGTTAGAC CTTATGTTAG 2040 
AGTCATTTCT GATGTTCCAG CTTCTAGCCA TGTAGTGCTC TCAGTCTTCA TACCCCAGAA 2100 
ATTATTGGTA TATTTGTAGA TACCGAGAATGATCCCTCAG TCTGAGAGGT TAGAATGATC 2160 
ATCTGTAATC TGAGGGTTAA TTTCTAGGCA GGTGGAGAGA GTGGTAAAAA AGAAATG AAA 2220 
TTGACAAGCT AGGAAAGAGG AGGCAGAAAG ATTTGGAAAA TTCACAGAGT TTCACCCTTA 2280 
AGCTGTAGAG AGTGGGTCAC ATTTGTTAGC CACGGAAACA TAGAAACATA CACAAGGCCA 2340 
GAAAAAGAAG AAGGAGCTCA ACTAAAAGTG GCATAGAGAA TACACATATA AAAACAATAT 2400 
ATTTGTCATA TGCTCCTAGA GAGGAGAAAG GGGTGATTGA AAGAAAAAAA AATACTTAAA 2460 
TATTTGTAAT TGTG AGGGGT TTCTTTTGGA AATAATTACT TTTG AACCAT GTATGTGGTA 2520 
TGTATATTTT CAGTGGGTTA ATTATACCCC ATGATACCTA TTAAAGGAAA ACCAGTGGGT 2580 
CTGGTGGTGC TGGTCTTTTC CTCCCCATTC CTACAATTTC TATGTGGCOC AAGTCATTCC 2640 
TAATCTTGGT CTCTATAGCA GTGTTCTCT C TGAATGCTGA GCTG AAGAAA TTATACGTAC 2700 
ATACACACAT ACATACATAC ATACAAATAT ATGTATATAT ATTCTCAGCT GCTGOGGGAG 2760 
GTAGGTACCA TGGCCATTCA GCACAGCCTT GATTTCCTCC CAAAGTAGGT GAGCTATAGT 2820 
GAAGAATAGG TGCAAACAAA CAAGCTTACT TCCATTGCAA AATAGAAGAA GAGGAAGTTA 2880 
GAG ATAATTC TGATCAATCA TrTTGGAGGC TTTGTTATAA GGCAAGOOCC GGTATATCAT 2940 
GGAATTTCCA TTGACATTTG AATTTGGACT TGGATCTTCC CTTGGTCCCA TTAGCTGAGG 3000 
TTTAGTAATC TAAAGTCCCT ATAGTATATG ATTATAATGC TATTTTAAAA AATATATATA 3060 
TAAAATATTT TTTTCTTTTT AAAATAGACA CTATAGTTTT ACCCATAAGT AATATTTAAA 3120 
GATTATAGCT CCCAAAAG AA TGGACCAACC ACTTTCGTAT CATAATTTCT TTTTGGTAAA 3180 
TATGAG ACTA TTATGAAATC ATAGTATATG ATTGTATTTA AAGGTACAAT CAAAGGATCT 3240 
TTTGTCCATT CCATTAATAA CTGAATAAAA AATAAATAAA ATGGATAGAA AAAAACTAAA 3300 
GTTGAAAATA CATTCTTAAA CTAGTTGTCT GAAATGAGAA AAGAGTGAGA ACTAGGTGTG 3360 
CAAGAACCAA ACGTATTTTA TTTTATTTTT TAAATGGGAG CAACATATCA GTCGTGTCAC 3420 
CAGCTGGTAT ATTGTGTAAA TATTAAAGCT CCATTGGGAC TGATTTTTCA TGGCAACATC 3480 
AGCTTTCTAA TGTTCTAAAT TCTATAAAAA CCACCCACAA AGAAACAAAG CAAATTTCAT 3540 
TATCTAATGA GTTGCTGGAA AATCATATTG AGAATAATTA TTTCAGATTC CTCAGTTGTT 3600 
AACTTCTACA TTCAAGGGCT TATCTCTGCC CCCATTGATT TTTAACCTCA AAATGGTGTG 3660 
AGATTTACTG TGGAACCCTA AAGCAGTAAA ATAAAAAACC TGGTTGCAGC ACATTCACAC 3720 
TGTTGTCCTT AAAATTCCCC TTTTTTCTCT ATGTACGATA AAGTAACAGT ATGTCAGATA 3780 
AGCCGGTGGG GGGATGAGAT TAGGCTGAGG CAGTGCTAGT CAACTGGGGG AAAAGGATGA 3840 
TGGAAAAATC ACCCAGTTGT GCTATATTTT TAAAGAAGGA GGTCGTTTAT GTGTGCAG AC 3900 
AATTCTCCCT GAGGTTAGCC CAATGG AG AA ATGAAGCAGA GG AAGGAAAC ATAG AAAGAC 3960 
ATGGGCTATC AGGGAGG AAG ATGTTCAATA GAACATGCAA GAATTTCTGG AAGAAAGGCT 4020 
GTGGAAGGGC CAATGGAGAA AATGAATGGA CAAAGCTCAG GAATCCCTAC GCTATGTAGA 4080 
ATGTTCTTGG TGTTATCAGG GTTAAGCCCT GTAATTATGT AACCTATTTA TCGCAACATG 4140 
AATTTTTATG ATTTCTTGTG ATGTATTCTT TTATGAAATT AACAAGAACT CATTATTTTG 4200 
AGGTAGAGGA AAATCAATGC TTTATCTGAT ATGCTGAGAA ATTATTAG A T TGCC AATACT 4260 
CATGTGCGTT TCATGTGTTT TATAAGGTTT GTTCCTTTGA AG AATTGTAG TTCTTAGTCC 4320 
CACAGGGAAA TGTGTATCTA TTTATATATC ATAGTATAAA TCTATGATAT ATTTATATCA 4380 
TATATAAAAG TCTGAGTTCT CTTTCTTAGT CCCTAATCAT GTTTCTOCCA TAGGCTGTGT 4440
TTACATGG AG CTATCGGXTT AGCCTTTTAA GCTTCATTAG CTTGTCTATT ATTGAAATAG 4500 
TTTCCAAGAA ATTTTAG ATA TTATCATAAC ATCTGGGTCT ACTCAAACAC TTATTGTTTG 4560 
AAAGACTTAT GTCTTGGACC TATCAAAAAC TGACTTTATT TATTGCTTAG TGAAAATACT 4620 
AGTGGGATCA ACAATG ATTT TCTTGAATGG GCATGAATGG AGATGCCCGC ACAGTAATGT 4680 
AGAAATGTTT CATACAGCTA TTAAAATGTA ACTGACCTCC TTAGAGGCAG ATTAGT AACT 4740 
GTTCCTACTT TGTATAGCTA AGTG ACAGTC ACTTAACTTA CATGACTTTC TTTTTTCACA 4800 
TrGGGTCTCT GGTCCTGTGT CTTCACCTCA TTTATAGCAC GTCTCCTTG A TTTTTGGTAG 4860 
TATCAACTTC CCAGTGATCT GTTCAGTTAA GTTCTTCTCC CGTTAACCAG GAAGTGCTTA 4920 
TTCTCTCATC ACAGTGGGAA GAATAGCCTA TTGTCTTTCA TTTTGCCTG A GTGTATTTTA 4980 
CTATTTGGGC TCTG AAATAA AAATTATGAA ATATGGTG AG GTCACATGTT GGTGCTGCCT 5040 
TGCTGCATAA AATTCTAGG A GGGCAGGTTA GG AGACAGTT ATGTATGGCC TTTCGGGAAA 5100 
ATTCAAAGGG TGGGATTACA AGGGTGTTCC TCAGGCATGC CCCTATGGGC CCTATGTGGA 5160 
AGCAAG AAG A ATTG ACTGAT TTACAGG ACT TCTCTTTATG TCAATCTTAA GAGG ATGG AT 5220 
G AATCTGG AC ATTTGTTCCA CCCGACCTCT GACTGATGGT TTGGAAAATA ACTTTAATTA 5280 
GGATCATATG ACCATTGAAA AAGOAAAAAT GTAG ACTCTG ACTTCCGTCC CACTGAAGGA 5340 
TTAATGAAAA CCTTTACTAO CATTT AOAGC TTTTCAGAAC ATCCCCACTG TCATGTGTCT 5400 
CAGCAGTGGA G ACTGCAAGT AAGGCTTTTA ATTTTAGGAG GUI ill TIT TTTTTTTTTT 5460 
TTCCCCTAAA TGGTATGGCC AAAAGTCAGA GTTAAAATAT ATATAGTTAG ATTCGAACTT 5520 
CCTCCTTCAC TCTAAAAATA GAATCCAAAC CCACTCTTCA TATATGCTTC CAGAATGGGG 5580 
CTTAAGTACC AATCTCTGCT TTGCAATGGG CACAATCTTG GTCATGTCCTGAGGCTCTCT 5640 
AAGAAAAGAG AGGATCTAGG ATGGGAGAGC TAGAAAGTTG CTAACTGGGA AGAACAAGGC 5700 
CCTGAGGGGT TGGTCTACCA ATCTGGGAAG ATTTGAAAAC AAACTTCTCG CAACTG AAGG 5760 
A AGGCTG AAG GCTGCTGCAA GTCATTGAGT GACTTTAGGA TGAGCAAAAC ATTGGGCCAC 5820 
TTCCTAATGC CCTATGTGTA TAGTACCAG A AGCAAGGTCT CAGACTTAAC AGACCCAGCT 5880 
CTGTTCCAAG GTGAGTCTG A ACCAATAGAA AGCAAACATG TGCAGATATC CAAACAAGAC 5940 
TGCTCATGCA AGTCGGGGCT GGCTACCCGT CTTAGGCAGC AACAGCAGAG CTCCAGGGAG 6000 
CTTATTCAAT ATTTACTGAG ACTTCGAAGA CCCAGCAGAT GTTTAATGAA GTCACTATTT 6060 
TGGCTCAAAC CCTCCACTTC TCCCCCTCCC CTCAAAAAGC CAACAGGTAA ACACATAAAT 6120 
GAAAGAAACC CACAGA AGGG GATGGGAAAT AAAGAAAATT CTCTCAAGAC TTCTCCAGGC 6180 
CCATGTCACT GGTCAGCGTG GTTTTTATGT GTATTAGGAT TGGGGGATGT GAAGAAATAA 6240 
GTATCCAGTA CTTTATAACC AAAGCAATTA AATCATATTG GGGTAGGGAA TGTTGGCCAG 6300 
TTTTGTTTAG TTTTGCCATC ACATTGTCAC CCAGACCTCA CCTAGCCCCA AGTAATCGGG 6360 
CCCCCCGAAG AGGG AG ACAG AG ATGTGCCA GAGTTGACCC AGTGTGCGGA TGATAACTAC 6420 
TO ACGAAAGA GTCATCGACC TCAGTTAGTG GTTGGATGTA GTCACATTAG TTTGCCTCTC 6480 
CCCATCTTTG TCTCCCTGGC AAGGAG AATA TGCGGGACAT GATGCTAAGA GCCCTGGGTA 6540 
AATGTGGTGA GAATGCACGC GTGCATATGC TACACATATG TGCTTCTCAG TTGCAGAAAA 6600 
TGAACTGCTT TGGG AGATTA TCAGTAGA AA G AGTGTTATC ATATTGGTGC TGAGTGCTAT 6660 
GTGTGCTTAT ACAATTTGTT CTTGTATTTT AATAAACTTT OAATAAAAGA ATAAAAAAAA 6720
AAAAAAAAAA AAAAA 



SEQ ID NO:132 PFH6 Protein sequence:

Protein Accession #: NP_054644.1

1 11 21 31 41 51
| | | | | |

MGILSVDLLI TLQILPVFFS NCLFLALYDS VUXKHWLL LSRSKSTRGE WRRMLTSEGL 60 
RCVWKSFULD AYKQVKLGED APNSS WHVS STEGGDNSGN GTQEKIAEGA TCHLLDFASP 120 
ERPLWNFGS ATXPPFTSQL PAFRKLVEEF SSVADFLLVY IDEAHPSDGW AIPGDSSLSF 180 
EVKKHQNQED RCAAAQQLLE RFSLPPQCRV VADRMDNNAN IAYGVAFERV CIVQRQK1AY 240 
LGGKGPFSYN LQEVRHWLEK NFSKRXKKTR LAG 



SEQ ID NO:133 PFH5 DNA SEQUENCE

Nucleic Acid Accession #: NM_001141

Coding sequence: 72-2102 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
| | | | | |

CAGGCGTGTC CCAGGGGGAG (XCCGCTCTG CAGCCCTGTG CGCCGTAGAG AGCTGGACTT 60 
AGGCTGGCAG CAJGGCCGAG TTCAGGGTCA GGGTGTCCAC CGGAGAAGCC TTOGGGGCTG 120 
GCACATGGGA CAAAGTGTCT GTCAGCATCG TGGGGACOOG GGGAGAGAGC CCCOCACTGC 180 
OCCTGGACAA TCTCGGCAAG GAGTTCACTG CGGGCGCTGA GGAGGACTTC CAGGTGACGC 240 
TCCCGGAGGA CGTAGGOCGA GTGCTGCTGC TGCGCGTGCA CAAGGCGCCC CCAGTGCTGC 300 
OCCTGCTGGG GCCCCTGGCC CCGGATGCCT GGTTCTGCCG CTGGTFCCAG CTGACACOGC 360 
CGCGGGGCGG CCACCTCCTC TTCCCCTGCT ACCAGTGGCT GGAGGGGGCG GGGACCCTGG 420 
TGCTGCAGGA GGGTACAGCC AAGGTGTCCT GGGCAGACCA CCACCCTGTG CTCCAGCAAC 480 
AGCGCCAGGA GGAGCTTCAG GCCCGGCAGG AGATGTACCA GTGGAAGGCT TACAACCCAG 540 
GTTGGCCTCA CTGCCTGG AT GAAAAGACAG TGGAAGACTT GGAGCTCAAT ATCAAATACT 600 
CCACAGCCAA GAATGCCAAC TTTTATCTAC AAGCTGGCTC TGCTTTTGCA GAGATGAAAA 660 
TCAAGGGGTT GCTGG A(XGC AAGGGGCTCT GG AGG AGTCT G AATG AG ATG AAAAGGATCT 720 
TCAACTTCCG GAGG ACCCCA GCAGCTGAGC ACGCATTTGA GCACTGGCAG GAGGATGCCT 780 
TCTTCGCCTC OCAGTTOCTG AATGGTCTCA ACOCTGT0CT GATOCGCCGC TGTCACTACC 840 
TCCCAAAGAA CTTCCCCGTC ACTG ATGCCA TGGTGGCCTC ATTGTTGGGT CCTGGGACCA 900 
GCTTGCAGGC TGAGCTAGAG AAGGGCTCCC TGTTCTTGGT GGATCACGGC ATCCTCTCTG 960 
GCATCCAGAC CAATGTCATT AATGGGAAGC OGCAGTTCTC TGCGGOCCCA ATGACCCTGC 1020 
TATACCAGAG COCAGGCTGC GGGCCGCTGC TGOCTCTCGC CATCCAGCTC AGCCAGACCC 1080 
CCGGCCCAAA CAGCCCCATC TTCCTGCCCA CTGATGACAA GTGGGACTGG TTGCTGGCCA 1140 
AGAOCTOGGT GOGCAATGOC GAGTTCTCCT TCCATGAGGC CCTCACGCAC CIGCTGCACT 1200 
CACATCIGCT GCCTG AGGTC TTCACCCTGG CTAOOCTGCG TCAGCTGCCC CACTGCCACC 1260 
CTCTCTTCAA GCTGCTGATC CCGCACACOC GATACACGCT GCACATCAAC ACACTCGCCC 1320 
GGGAGCTGCT TATCGTGCCA GGGCAGGTGG TGGACAGGTC CACAGGCATC GGCATTGAAG 1380 
GCTTCTCTGA GTTGATACAG AGGAACATGA AGCAGCTGAA CTATTCTCTC CTGTGTXTGC 1440 
CTGAGGATAT CCGGACCCGA GGAGTTGAAG ACATCCCAGG CTACTACTAC CGTGATGATG 1500 
GGATGCAGAT TTGGGGTGCA GTGGAACGCT TTGTCTCTGA AATCATCGGT ATCTACTACC 1560 
CAAGTGATGA GTCTGTCCAA GATGACAGAG AGCTCCAGGC CTGGGTCAGA GAGATCTTCT 1620 
CCAAGGGCTT CCTAAACCAG GAGAGCTCAG GTATOCXnTC CTCACTGG AG ACCCGGGAAG 1680 
CCCTGGTGCA GTATGTCACC ATGGTGATAT TCAOCTGCTC AGCCAAGCAT GCGGCTGTCA 1740 
GTGCAGGGCA GTTTGACTCC TGTGCTTGGA TGCCCAACCT GCCACCCAGC ATGCAGCTGC 1800 
CACCACCCAC CTCCAAAGGC CTGGCAACAT GCGAGGGCTT CATAGCCACC CTCCCACCTG 1860 
TCAATGCCAC ATGTGATGTC ATOCTTGCTC TCTGGTTGCT G AGCAAGGAG CCTGGAGACC 1920 
AAAGGCCOCT GGGCACCTAT CCGG ATGAGC ACTTCACAG A GG AGGCCCCT OGGGGGAGCA 1980 
TCX3CCACCTT CCAGAGCCGC CTGGCCCAGA TCTCGAGGGG CATCCAGGAG CGGAACCGGG 2040 
GCCTGGTGCT GCCCTACACC TACCTAGACC CTCCCCTCAT CGAGAACAGC GTCTCCATXI 2100 
AAATCCCAGG GGAACACAGG CCCAGATGAC ATCCCTTTGA CCACATCGCT CTAGGATAAC 2160 
TGGCACOCAG AGAAAAGGAC TCCTCAGAAA AAACAGGCCC CCATGTGCCT CTCCTGGGAC 2220 
AACCAGACTC TGTAACTCAC CCCCACCACC ATACACACAC ACAAAAACAG AAACAAAATC 2280 
AAAACAGAGA AAGCAGAAAA TCTACCAAGA ACAGAGTCTC AGGACAGAAC CACTGAGTCT 2340 
TTTGGAGGCT CCAAGCCTCA AAGTGCCCGC AGAGCGCACC TTGAGGGTTT TGCTAGTTGG 2400 
1 1 1 rGTTTTD CGTTTACAGC OOTGGGGGGA AGCACATAAT CCCGCCCCAG GOCCCACTAG 2460 
CATCCACTGA TTGGACCTTA TGGTCACCCA ACTCAAGGAC AGOCACCAAG AAGTGGCTGC 2520 
CAAAGAGACT GGGCGCAGTG GCTCATXJCCC ATAATOCCAG CACTTTGGGA GATGGAGGCG 2580 
GGAAAATCAT TTGAGGTCAG AAGTTCAAGG CCAGCCTGGA CGACAT AGCG AGACTCCACC 2640 
TCTACCAAAA AATAAAAATT AAAAAACAAA AAAAAAAAAA AAAAA 



SEQ ID NO:134 PFH5 Protein sequence:
Protein Accession #: NP_001132.1

1 11 21 31 41 51
| | | | | |

MAEFRVRVST GEAPGAGTWD KVSVSIVGTR GESPPLPLDN LGKEFTAGAE EDFQVTLPED 60 
VGRVLLLRVH KAPPVLPLLG PLAPDAWFCR WPQLTPPRGG HLLFPCYQWL EGAGTLVLQE 120 
GTAKVSWADH HPVLQQQRQE ELQARQEMYQ WKAYNPGWPH CLDEKTVEDL ELNIKYSTAK 180 
NANFYLQAGS AFAEMKIKGL LDRKGLWRSL NEMKRIFNFR RTPAAEHAFE HWQEDAFFAS 240
QFLNGLNPVL KRCHYLPKN FPVTDAMVAS LLGPGTSLQA ELEKGSLFLV DHGILSGIQT 300 
NVINGKPQFS AAPMTLLYQS PGCGPLLPIA IQLSQTPGPN SPIFLPTDDK WDWIJLAKTWV 360 
RN AEFSFHEA LTHLLHSHLL PEVFTLATLR QLPHCHFLFK LUPHTRYTL HINTLARELL 420 
IVPGQWDRS TGIGIEGFSE UQRNMKQLN YSLLCLPEDI RTRGVEDIPG YYYRDDGMQI 480 
WGAVERFVSE DGIYYPSDE 5VQDDREUQA WVREffSKGF LNQESSGIPS SLETREALVQ 540 
YVTMVIFTCS AKHAAV5AGQ FDSCAWMPNL PPSMQLPPPT SKGLATCEGF IATLPPVNAT 600 
CDVttALWLL SKEPGDQRPL GTYPDEHFTE EAPRRSIATF QSRLAQISRG IQERNRGLVL 660 
PYTYLDPPU ENSVSI 



SEQ ID NO:135 PFH4 DNA SEQUENCE

Nucleic Acid Accession #: NM_002742

Coding sequence: 236-2974 (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51 

GAA'l fOCI 1C TCTCCTCCTC CTOGCCCTTC TCCTCGCCCT CCTCCTCCTC CTCGCCCTCC 60 
CCTCCCGATC CTCATCCCCT TGCCCTCCCC CAGCCCAGGG ACTTTTCCGG AAAGTTTTTA 120 
TTTTCCGTCT GGGCTCTCGG AGAAAGAAGC TCCTGGCTCA GCGGCTGCAA AACTTTCCTG 180 
CTGCCGCGCC GCCAGCCCCC GCCCTCCGCT GCCCGGCCCT GCGCCCCGCC GAGCGATQAG 240 
CGCCCCTCCG GTCCTGCGGC CGCCCAGTCC GCTGCTGCOC GTGGCGGCGG CAGCTGCCGC 300 
AGCGGCCGCC GCACTGGTCC CAGGGTCCGG GCCCGGGCCC GCGCCGTTCT TGGCTCCTGT 360 
CGCGGCCCCG GTCGGGGGCA TCTCGTTCCA TCTGCAGATC GGCCTGAGCC GTGAGCCGGT 420 
GCrcCTGCTG CAGG ACT0GT CCGGGG ACTA CAGCCTGGCG CACGTCOGCG AGATGGCTTG 480 
CtOCATTGTC GACCAGAAGT TCCCTGAATG TGGTTTCTAC GGAATGTATG ATAAGATCCT 540 
GCTTTTTCGC CATGACCCTA CCTCTG AAAA CATCCTTCAG CTGGTG AAAG CGGCCAGTGA 600 
TATCCAGGAA GGCG ATCTTA TTGAAGTGGT CTTGTCACGT TCCGCCACCT TTGAAGACTT 660 
TCAGATTCGT CCCCACGCTC TCTTTGTTCA TTCATACAGA GCTCCAGCTT TCTGTGATCA 720 
CTGTGGAGAA ATGCTGTGGG GGCTGGTACG TCAAGGTCTT AAATGTGAAG GGTGTGGTCT 780 
GAATTACCAT AAGAGATGTG CATTTAAAAT ACCCAACAAT TGCAGOGGTG TGAGGCGGAG 840 
AAGGCTCTCA AACGTTTCCC TCACTGGGGT CAGCACCATC CGCACATCAT CTGCTGAACT 900 
CTCTACAAGT GCCCCTGATG AGCCCCTTCT GCAAAAATCA CCATCAG AGT CGTTTATTGG 960 
TCGAGAGAAG AGGTCAAATT CTCAATCATA CATTGGAOGA CCAATTCACC TTGACAAGAT 1020 
TTTGATGTCT AAAGTTAAAG TGCCGCACAC ATTTGTCATC CACTCCTACA CCCGGCCCAC 1080 
AGTGTGCCAG TACTGCAAG A AGCTTCTGAA GGGGCTTTTC AGGCAGGGCT TGCAGTGCAA 1140 
AGATTGCAGA TTCAACTGCC ATAAACGTTG TGCACCGAAA GTACCAAACA ACTGCCTTGG 1200 
CGAAGTGACC ATTAATGGAG ATTTGCTTAG CCCTGGGGCA GAGTCTGATG TGGTCATGGA 1260 
AGAAGGGAGT GATGACAATG ATAGTGAAAG GAACAGTGGG CTCATGGATG ATATGGAAGA 1320 
AGCAATGGTC CAAGATGCAG AGATGGCAAT GGCAGAGTGC CAGAACGACA GTGGCGAGAT 1380 
GCAAGATCCA GACCCAGAOC AOGAGGACGC CAACAGAACC ATCAGTCCAT CAACAAGCAA 1440 
CAATATCCCA CTCATGAGGG TAGTGCAGTC TGTCAAACAC ACGAAGAGGA AAAGCAGCAC 1500 
AGTCATGAAA GAAGGATGGA TGGTCCACTA CACCAGCAAG GACACGCTGC GGAAAGGGCA 1560 
CTATTGGAG A TTGGATAGCA AATGTATTAC CCTCTTTCAG AATG ACACAG G AAGCAGGTA 1620 
CTACAAGGAA ATTOCTTTAT CTGAAATTTT GTCTCTGGAA CCAGTAAAAA CTTCAGCTTT 1680 
AATTCCTAAT GGGGCCAATC CTCATTGTTT CG AAATCACT AOGGCAAATG TAGTGTATTA 1740 
TGTGGGAGAA AATGTGGTCA ATCCTTOCAG CCCATCACCA AATAACAGTG TTCTCACCAG 1800 
TGGOGTTGGT GCAGATGTGG CCAGGATGTG GGAG ATAGCC ATOCAGCATG COCTTATGCC 1860 
CGTCATTCCC AAGGGCTCCT CCGTGGGTAC AGG AACCAAC TTGCACAGAG ATATCTCTGT 1920 
GAGTATTTCA GTATCAAATT GCCAGATTCA AGAAAATGTG GACATCAGCA CAGTATATCA 1980 
G ATTTTTOCT G ATGAAGT AC TGGGTTCTGG ACAGTTTGGA ATTGTTTATG GAGG AAAACA 2040 
TCGTAAAACA GGAAGAGATG TAGCTATTAA AATCATTGAC AAATTACG AT TTCCAACAAA 2100 
ACAAGAAAGC CAGCTTCGTA ATGAGGTTGC AATTCTACAG AACCTTCATC ACCCTGGTGT 2160 
TGTAAATTTG GAGTGTATGT TTGAGACGCC TGAAAGAGTG TTTGTTGTTA TGGAAAAACT 2220 
CCATGOAGAC ATGCTGGAAA TGATCTTGTC AAGTGAAAAG GGCAGGTTGC CAGAGCACAT 2280 
AACGAAGTTT TTAATTACTC AGATACTCGT GGCTTTGCGG CACCTTCATT TTAAAAATAT 2340 
CGTTCACTGT GACCTCAAAC CAGAAAATGT GTTGCTAGCC TCAGCTGATC CTTTTCCTCA 2400 
GGTG AAACTT TGTGATTTTG GTTTTGCCCG GATCATTGGA GAGAAGTCTT TCCGGAGGTC 2460 
AGTGGTGGGT ACCOCX3GCTT ACCTGGCTCC TGAGGTCCTA AGGAACAAGG GCTACAATCG 2520 
CTCTCTAGAC ATGTGGTCTG TTGGGGTCAT CATCTATGTA AGCCTAAGCG GCACATTCCC 2580 
ATTTAATGAA GATGAAGACA TACACGACCA AATTCAGAAT GCAGCTTTCA TGTATCCACC 2640 
AAATCCCTGG AAGG AAATAT CTCATG AAGC CATTGATCTT ATCAACAATT TGCTGCAAGT 2700 
AAAAATGAG A AAGCGCTACA GTGTGG ATAA GACCTTG AGC CACCCTTGGC TACAGGACTA 2760 
TCAGACCTGG TTAGATTTGC GAG AGCTGG A ATGCAAAATC GGGGAGCGCT ACATCACCCA 2820 
TGAAAGTG AT GACCTGAGGT GGGAG AAGTA TGCAGGCGAG CAGCGGCTGC AGTACCCCAC 2880 
ACACCTGATC AATCCAAGTO CTAGCCACAG TGACACTCCT GAGACTGAAG AAACAGAAAT 2940 
GAAAGCCCTC GGTGAGCGTG TCAGCATCCT CJgAGTTCCA TCTCCTATAA TCTGTCAAAA 3000 
CACTGTGGAA CTAATAAATA CATACGGTCA GGTTTAACAT TTGCCTTGCA GAACTGCCAT 3060 
TATTTTCTGT CAGATGAGAA CAAAGCTGTT AAACTGTTAG C ACTGTTG AT GTATCTG AGT 3120 
TGCCAAGACA AATCAACAGA AGCATTTGTA TTTTGTGTGA CCAACTGTGT TGTATTAACA 3180 
AAAGTTCCCT G AAACACGAA ACTTGTTATT GTGAATGATT CATGTTATAT TTAATGCATT 3240 
AAACCTGTCT CCACTGTGCC TTTGCAAATC AGTGTTTTTC TTACTGGAGC TTCATTTTGG 3300 
TAAGAGACAG AATGTATCTG TGAAGTAGTT CTGTTTGGTG TGTCCCATTG GTGTTGTCAT 3360 
TGTAAACAAA CTCTTGAAGA GTCGATTATT TCCAGTGTTC TATGAACAAC TCCAAAACCC 3420 
ATGTGGG AAA AAAATGAATG AGG AGGGTAG GG AATAAAAT CCTAAGACAC AAATGCATGA 3480 
ACAAGTTTTA ATGTATAGTT TTGAATCCTT TGCCTGCCTG GTGTGCCTCA GTATATTTAA 3540 
ACTCAAGACA ATGCACCTAG CTGTGCAAGA CCTAG TGCTC TTAAGCCTAA ATGCCTTAG A 3600 
AATGTAAACT GCCATATATA ACAGATACAT TTCCCTCTTT CTTATAATAC TCTGTTGTAC 3660 
TATGGAAAAT CAGCTGCTCA GCAACCTTrC ACCTTTGTGT ATTTTTCAAT AATAAAAAAT 3720
ATTCTTGTCA AAAAAAAAAA AA 



SEQ ID NO:136 PFH4 Protein sequence:
Protein Accession #: NP_002733.1



1 11 21 31 41 51

| | | | | |

MSAPPVLRPP SPLLPVAAAA AAAAAALVPG SGPGPAPFLA PVAAPVGGIS FHLQIOLSRE 60 
PVLLLQDSSG DYSLAHVREM ACSIVDQKFP ECGFYGMYDK ILLFRHDPTS ENILQLVKAA 120 
SD1QEGDUE WLSRSATFE DFQIRPHALF VHSYRAPAPC DHCGEMLWGL VRQGLKCEGC 180 
GLNVHKRCAF KIPNNCSGVR RRRLSNVSLT GVSTIRTSSA ELSTSAPDEP LLQKSPSESF 240 

IGREKRSNSQ SYIGRPIHLD KILMSKVKW HTFVIHSYTR PTVCQYCKKL LKGLFRQGLQ 300

CKDCRFNCHK RCAPKVPNNC LGEVTINGDL LSPGAESDW MEEGSDDNDS ERNSGLMDDM 360 
EEAMVQDAEM AMAECQNDSG EMQDPDPDHE DANRTISPST SNNIPLMRW QSVKHTKRKS 420 
STVMKEGWMV HYTSKDTLRK RHYWRLDSKC ITLFQNDTGS RYYKEIPLSE ILSTJBPVKTS 480 
AUPNGANPH CFETITANVV YYVGENWNP SSPSPNNSVL TSGVOADVAR MWHAIQHAL 540 

MPVIPKGSSV GTGTNLHRDI SVSISVSNCQ IQENVDISTV YQIFPDEVLG SGQFGIVYGG 600

KHRKTGRD VA IKHDKLRFP TKQESQLRNE VAUJQNLHHP GWNLECMFE TPERVFWME 660 
KLHGDMLEMI LSSEKGRLPE HTTKFUTQI LVALRHLHFK NIVHCDLKPE NVLLASADPF 720 
PQVKLCDFGF ARIIGEKSFR RS WGTPAYL APEVLRNKGY NRSLDMWSVG VDYVSLSGT 780 
FPFNEDEDIH DQIQNAAFMY PPNPWKEISH EAIDLINNLL QVKMRKRYSV DKTLSHPWLQ 840 

DYQTWLDLRE LECKIGERY1 THESDDLRWE KYAGEQRLQY PTHLINPSAS HSDTPETEET 900
EMKALGERVSIL 



SEQ ID NO:137 PFH3 DNA SEQUENCE

Nucleic Acid Accession #: X95425

Coding sequence: 712-3825 (underlined sequences correspond to start and stop codons)
1 11 21 31 41 51

| | | | | |

AATGGTCAGT CAATACATTA TAACATAATA CACCAAATGC TAG AATAGAA GGGGAGGGGG 60 
GCACACATAA TCACTCACTG CTGOAAGAAG GGTGCATCAG TGAATTAAAA AATGTCCCTC 120 
CCCTCTTCAG CACTCAGCGC GCAGCTATTT CCTTCTGCCA GTCTCTTTGA ACTCTGGATC 180 
TTTGCTTTTG CTOGCTGCTC TCCTGTTTTT CATTCTCCAC ATTTTCTCAA TCCTCTTTCT 240 

TTATCCTTAG CCACCCTGCT TTTTTCCTCC TTTTTTAAAA AATCGGAGAT TTCGTCTTAA 300
AATGATTTGT CTTCCTTACC TTCGTCCATT TCAACACTGA AGGCTGCAAA GAACTTCACC 360 
TTTOCCCTAG TGGTATTTAA AAATTCTCAA TCCGTAAAAA GTCTTTTTG A AAGGCAAAGG 420 
AACAGGACCC AGACCCTCTC GACACCCTTG ATCCGAGTCA GATCTGCACT AGCAACCAGA 480 
ACTAATATTT CATTTAACCC ACCAAAAG GG GGAGGCGAGA GGAGOCAGAA GCAAACTTCA 540 

TCTGTCTCAG AOGGATCOGT GGTTOCTACA TTTGGAGGAG COGOGTGTCA GAAGGCGTAG 600
GACGOCAAGG GGGG ACAAGG AGGACTCCCG AGTCTCGCTT CTOCGCICTC CGAGACCGAA 660 
GAGGTGGACT GAGCCGCTCG GG ACAGCGGC ACCGGAGGAG GCTCGGAGAA GATGCGGGGC 720 
TCGGGGCCCC GGGGTGCGGG ACACCGGCGG CCCCCAAGCG GCGGCGGCGA CACCCCCATC 780 
AOCCCAGCGT CCCTGGCCGG CTGCTACTCT GCACCTCGAC GGGCTCCCCT CTGGACGTGC 840 

CTTCTCCTGT GOGCCGCACT CCGGACCCTC CTGGCCAGCC CCAGCAACGA AGTGAATTTA 900
TTGGATTCAC GCACTGTCAT GGGGGACCTG GGATGGATTG CTITTCCAAA AAATGGGTGG 960 
GAAGAGATTG GTGAAGTGG A TGAAAATTAT GCCCCTATCC ACACATACCA AGTATGCAAA 1020 
GTGATGG AAC AGAATCAGAA TAACTGGCTT TTGACCAGTT GGATCTCCAA TG AAGGTGCT 1080 
TCCAGAATCT TCATAG AACT CAAATTTAGC CTGCGGGACT GCAACAGCCT TGCTGGAGGA 1140 

CTGGGGACCT GTAAGG AAAC CTTTAATATG TATTACITTG AGTCAGATGA TCAGAATGGG 1200
AGAAACATCA AGGAAAACCA ATACATCAAA ATTGATACCA TTGCTGCCGA TGAAAGCTTT 1260 
ACAGAACTTG ATCTTGGTGA CCGTGTTATG AAACTGAATA CAGAGGTCAG AGATGTAGGA 1320 
CCICTAAGCA AAAAGGGATT TTATCTTGCT TTTCAAGATG TTGGTGCTTG CAT1GCTC1G 1380 
GTTTCTGTGC GTGTATACTA TAAAAAATGC OC1TCTOTOO TACGACACTT GGCTGTCTTC 1440 

CCTGACACCA TCACTGGAGC TGATTCTTCC CAATTGCTOG AAGTGTCAGG CTCCT3GTGTC 1500
AAOCATTXTTG TOACCGATGA ACCTCCCAAA ATGCACTGCA GCGCCGAAGG GGAGTGGCTG 1560 
GTGCCCATCG GGAAATGCAT GTGCAAGGCA GGATATGAAG AGAAAAATGG CACCTGTCAA 1620 
GTGTGCAG AC CTGGGTTCTT CAAAGCCTCA CCTCACATCC AGAGCTGCGG CAAATGTCCA 1680 
CCTCACAGTT ATACCCATGA GG AAGCTTCA ACCTCTTGTG TCTGTGAAAA GGATTATTTC 1740 

AGGAGAGAGT CTGATCCAOG CACAATGGCA TGCACAAGAC OCCCCTCTGC TCCTOGGAAT 1800
GCCATCTCAA ATGTTAATGA AACTAGTGTC TTTCTGGAAT GGATTCCGCC TGCTG ACACT 1860
GGTGGAAGGA AAGACGTGTC ATATTATATT GCATGCAAGA AGTGCAACTC CXATGCAGGT 1920 
GTGTGTGAGG AGTGTGGCGG TCATGTCAGG TACCTTCCCC GGCAAAGCGG CCTGAAAAAC 1980 
ACCTCTGTCA TGATGGTGGA TCTACTCGCT CACACAAACT ATACCTTTGA GATTGAGGCA 2040 

GTGAATGGAG TGTCCGACTT GAGCCCAGGA GCCCGGCAGT ATGTGTCTGT AAATGTAACC 2100
ACAAATCAAG CAGCTCCATC TCCAGTCACC AATGTGAAAA AAGGGAAAAT TGCAAAAAAC 2160 
AGCATCTCTT TGTCTTGGCA AGAACCAG AT CGTCCCAATG GAATC ATCCT AG AGTATG AA 2220 
ATGAAGCATT TTG AAAAGGA GCAAGAGACC AGCTACACGA TTATCAAATC TAAAGAGACA 2280 
ACTATTACTG CAGAGGGCTT GAAACCAGCT TCAGTTTATG TCTTCCAAAT TCGAGCACGT 2340 

ACAGCAGCAG GCTATGGTGT CTTCAGTCGA AGA7TTGAGT TTGAAACCAC CCCAGTGT7T 2400
GCAGCATCCA GCGATCAAAG CCAGATTGCT GTAATTGCTG TGTCTCTG AC AGTAGGAGTC 2460 
ATTTTGTTGG CAGTGGTTAT CGGCGTCCTC CTCAGTGGAA GTTGCTGCGA ATGTGGCTGT 2520 
GGGAGGGCTT CTTCCCTGTG CGCTGTTGCC CATCCAATCC TAATATGGCG GTGTGGCTAC 2580 
AGCAAAGCAA AACAAGATCC AG AAGAGG AA AAG ATGCATT TTCATAATGG GC ACATTAAA 2640 
CTGCCAGGAG TAAGAACTTA CATTGATCCA CATACCTATG AGGATCCCAA TCAAGCTQTC 2700
CACXjAATTTG CCAAGGAGAT AGAAGCATCA TGTATCACCA TTGAGAGAGT TATTGGAGCA 2760 
GGTGAATTTG GTGAAGTTTG TAGTGGACGT TTGAAACTAC CAGGAAAAAG AGAATTACCT 2820 
GTGGCTATCA AAACCCTTAA AGTAGGCTAT ACTGAAAAGC AAOGCAGAGA TTTCCTAGGT 2880 
GAAGCAAGTA TCATGGGACA GTTTGATCAT CCTAACATCA TCCATTTAGA AGGTGTGGTG 2940 
AOCAAAAGTA AACCAGTGAT GATCGTGACA GAGTATATGG AGAATGGCTC TTTAGATACA 3000 
1 1 1 1 1U AAGA AAAACGATGG GCAOTTCACT GTGATTCAGC TTGTTGGCAT GCTOAGAGGT 3060 
ATCTCTGCAG GAATGAAGTA CCTTTCTGAC ATGGGCTATG TGCATAGAGA TCTTGCTGCC 3120 
AG AAACATCT TAATCAACAG TAACCTTGTO TGCAAAGTGT CTGACTTTGG ACTTTCCOGG 3180 
GTACTGGAAG ATGATCOOGA GGCAGCCTAC ACCACAAGGG GAGGAAAAAT TCCAATCAG A 3240 
TGGACTGCCC CAGAAGCAAT AGCTITCCGA AAGTTTACTT CTGCCAGTGA TGTCTGGAGT 3300 
TATGGAATAG TAATGTGGGA AGTTGTGTCT TATGGAGAGA GACCCTACTG GGAGATGACC 3360 
AATCAAG ATG TGATTAAAGC GGTAGAGGAA GGCTATCGTC TGCCAAGCCC CATGGATTGT 3420 
CCTGCTGCTC TCTATCAGTT AATGCTGGAT TGCTGGCAGA AAGAGCGAAA TAGCAGGCCC 3480 
AAGTTTGATG AAATAGTCAA CATGTTGGAC AAGCTGATAC GTAACCCAAG TAGTCTGAAG 3540 
ACGCTGGTTA ATGCATCCTG CAG AGTATCT AATTTATTGG CAGAACATAG COCACTAGGA 3600 
TCTGGGGCCT ACAGATCAGT AGGTGAATGG CTAG AGGCAA TCAAGATGGG CCGGTATACA 3660 
GAGATTTTCA TGGAAAATGG ATACAGTTCA ATGGACGCTG TGGCTCAGGT GAOCTTGGAG 3720 
GATTTGAGAC GGCTTGGAGT GACTCTTOTC GGTCAOCAGA AGAAGATCAT G AACAGCCTT 3780 
CAAGAAATGA AGGTGCAGCT GGTAAACGG A ATGGTGCCATTCTAACTTCA TGTAAATGTC 3840 
GCTTCTTCAA GTGAATGATT CTGCACTTTG TAAACAGCAC TGAGATTTAT TTTAACAAAA 3900 
AAA 



SEQ ID NO:138 PFH3 Protein sequence:
Protein Accession #: CAA64700.1



1 11 21 31 41 51
| | | | | |

MRGSGPRGAG HRRPPSGGGD TPITPASLAG CYSAPRRAPL WTCLLLCAAL RTLLASPSNE 60 
VNLLDSRTVM GDLGWIAFPK NGWESGEVD ENYAPIHTYQ VCKVMEQNQN NWLLTSWISK 120 
EGASRIFIEL KFTLRDCNSL PGGLGTCKET FNMYYFESDD QNGRNIKENQ YUCIDTIAAD 180 
ESFTELDLGD RVMKLNTEVR D VGPLS KKGF YLAPQDVG AC IALVSVRVYY KKCPS WRHL 240 
AVFPDTITGA DSSQLLEVSG SCVNHSVTDE PPKMHCSAEG EWLVPIGKCM CKAG YEEKNG 300 
TCQVCRPGFF KASPfflQSCG KCPPHSYTHE EASTSCVCEK DYFRRESDPP TMACTRPPSA 360 
PRNAISNVNE TS VFLEWIPP ADTGGRKDVS YYIACKKCNS HAGVCEECGG HVRYLPRQSG 420 
LKNTS VMM VD LLAHTNYTFE EAVNGVSDL SPGARQYVSV NVTTNQAAPS PVTNVKKGKI 480 
AKNSISLSWQ EPDRPNGDL EVEDCHFEKD QETS YTUKS KETTITABGL KPASVYVFQI 540 
RARTAAGYGV FSR RFEFETT PVFAASSDQS QIPVIAVS VT VGVUXAWI GVLLSGSCCE 600 
CGCGRA5SLC AVAHPQJWR CGYSKAKQDP EEEKMHFHNG HKLPGYRTY IDPHTYEDPN 660 
QAVHEFAKEI EASOTIERV IGAGEPGEVC SGRLKLPGKR ELPVAHCTLK VGYTEKQRRD 720 
FLGEASIMGQ FDHPNIIHLE GWTKSKPVM IVTEYMENGS LDTFIKKNDG QFTVIQLVGM 780 
LRG1S AGMKY LSDMGYVHRD LAARNIUNS KLVCKVSDFG LSRVLEDDPE AAYTTRGGKI 840 
PIRWTAPEAI AFRKFTSASD VWSYGIVMWE WSYGERPYW EMTNQDVKA VEEGYRLPSP 900 
MDCPAALYQL MLDCWQKERN SRPKFDETVN MLDKURNPS SLKTLVNASC RVSNLLAEHS 960 
PLGSGAYRS V GEWLEAIKMG RYTEIFMENG YSSMDAVAQV TLEDLRRLGV TLVGHQKKIM 1020 
KSLQEMKVQL VNGMVPL 



SEQ ID NO:139 PFH2 DNA SEQUENCE

Nucleic Acid Accession #: NM_016029

Coding sequence: 78-1097 (underlined sequences correspond to start and stop codons)
1 11 21 31 41 51 

CTGCGATCCC GCAGGGCAGC GACGCGACTC TGGTGCGGGC OG TCT TC T TC CCCCCGAGCT 60 
GGGCCTGCGC GGCCGCAATG AACTGGGAGC TGCTGCTGTG GCTOCTGGTG CTGTGCGCGC 120 
TGCTOCTGCT CTTGGTGCAG CTGCTGCGCT TCCTGAGGGC TGACGGCGAC CTGACGCTAC 180 
TATGGGCCGA GTGGCAGGGA CGACGCCCAG AATGGGAGCT GACTGATATG GTGGTGTGGG 240 
TGACTGGAGC CTCGAGTGGA ATTGGTGAGG AGCTGGCTTA CCAGTTGTCT AAACTAGGAG 300 
TTTCTC1 1U T GCTGTCAGOC AGAAGAGTGC ATGAGCTGGA AAGGGTGAAA AGAAGATGOC 360 
TAGAG AATGG CAATTTAAAA GAAAAAG ATA TACTTGTTTT GCCCCTTGAC CTGACCGACA 420 
CTGGTTCCCA TGAAGCGGCT ACCAAAGCTG TTCTCCAGGA GTTTGGTAGA ATCGACATTC 480 
TGGTCAACAA TGGTGGAATG TCOCAGCGTT CTCTGTOCAT GGATACCAGC TTGGATGTCT 540 
ACAGAAAGCT AATAGAGCTT AACTACTTAG GGAOGGTGTC CTTGACAAAA TGTGTTCTGC 600 
CTCACATGAT CGAGAGGAAG CAAGGAAAG A TTGTTACTGT GAATAGCATC CTGGGTATCA 660 
TATCTGTACC TCTTTCCATT GGATACTGTG CTAGCAAGCA TGCTCTCCGG GGTTTTTTTA 720 
A7GGCCTTGG AAC AGAACTT GCCACATACC CAGGTATAAT AGTTTCTAAC ATTTGCCCAG 780 
GACCTGTGCA ATCAAATATT GTGGAGAATT (XCTAGCTGG AGAAGTCACA AAGACTATAG 840 
GCAATAATGG AGACCAGTCC CACAAGATGA CAACCAGTCG TTGTGTGCGG CTGATGTTAA 900 
TCAGCATGGC CAATGATTTG AAAGAAGTTT GGATCTCAG A ACAACCTTTC TTGTTAGTAA 960 
CATATTTGTG GCAATACATG CCAAOCTGGG OCTGGTGOAT AACCAACAAG ATGG GGAA GA 1020 
AAAGG ATTG A G AACTTTAAG AGTGGTGTGG ATGCAG ACTC TTCTTATTTT AAAATCTTTA 1080 
AGACAAAACA TGACI£2AAAA GAGCACTTGT ACTTTTCAAG CCACTGGAGG GAGAAATGGA 1140 
AAACATG AAA ACAGCAATCT TCTTATGCTT CTG AATAATC AAAGACTAAT T7GTG ATTTT 1200 
ACTTTTTAAT AGATATGACT TTGCTTCCAA CATGGAATGA AATAAAAAAT AAATAATAAA 1260
AGATTGCCAT G AATCTTGCA AA 



SEQ ID NO:140 PFH2 Protein sequence:
Protein Accession #: NP_057113.1

1 11 21 31 41 51
| | | | | |

MNWELLLWLL VLCALLLLLV QLLRFLRADG DLTLLWAEWQ GRRPEWELTD MWW VTGASS 60 
GIGEELAYQL SKLGVSLVLS ARRVHELERV KRRCLENGNL KEKDILVLPL DLTDTGSHEA 120 
ATKAVLQEFG RIDILVNNGG MSQRSLCMDT SLDVYRKLIE LNYLGTVSLT KCVLPHMIER 180 
KQGKIVTVNS ILGHSVPLS IGYCA5KHAL RGFFNGLRTE LATYPGUVS N1CPGPVQSN 240
IVENSLAGEV TKTIGNNGDQ SHKMTTSRCV RLMUSMAND LKEVWISEQP FLLVTYLWQY 300 
MPTWAWWITN KMGKKR1ENF KSGVDADSSY FKIFKTKHD 



SEQ ID NO:141 PFH1 DNA SEQUENCE

Nucleic Acid Accession #: NM_021614
Coding sequence: 1-1740 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
| | | | | |

AJjGAGCAGCT GCAGGTACAA CGGGGGCGTC ATGCGGCCGC TCAGCAACTT GAGCGCGTCC 60 
CGOCGGAACC TGCACGAGAT GGACTCAGAG GOGCAGCCCC TGCAGOXCC CGCGTCTGTC 120 
GGAGGAGGTG GCGGCGCGTC CTCCCCGTCT GCAGCCGCTG CCGCCGCCGC CGCTGTTTCG 180 
TCCTCAGCCC CCGAGATCGT GGTGTCTAAG CCCG AGCACA ACAACTCCAA CAACCTGGCG 240 
CTCTATGGAA CCGGCGGCGG AGGCAGCACT GGAGGAGGCG GCGGCGGTGG CGGG AGCGGG 300 
CACGGCAGCA GCAGTGGC AC CAAGTCCAGC AAAAAGAAAA ACCAGAACAT CGGCTACAAG 360 
CTGGGCCACC GGCGCGCCCT GTTCGAAAAG CGCAAGCGGC TCAGCGACTA CGCGCTCATC 420 
TTCGGCATGT TCGGCATCGT GGTCATGQTC ATCGAGACCG AGCTGTCGTG GGGCGCCTAC 480 
GACAAGGCGT CGCTGTATTC CTTAGCTCTG AAATGCCTTA TCAGTCTCTC CACG ATCATC 540 
CTGCTOGGTC TGATCATCGT GTACCACGCC AGGGAAATAC A OT 1 GTTC AT GGTGGACAAT 600 
GGAGCAGATG ACTGG AGAAT AGCCATGACT TATGAGCGTA TTTTCTTCAT CTGCTTGGAA 660 
ATACTGGTGT GTGCTATTCA TCCCATACCT GGGAATTATA CATTCACATG GACGGCCCGG 720 
CTTGCCTTCT CCTATGCCCC ATCCACAACC ACCGCTGATG TGGATATTAT TTTATCTATA 780 
CCAATGTTCT TAAGACTCTA TCTGATTGCC AGAGTCATGC TTTTACATAG CAAA C1 ITTC 840 
ACTGATGCCT CCTCTAGAAG CATTGGAGCA CTTAATAAGA TAAACTTCAA TACACGTTTT 900 
GTTATG AAG A CTTTAATG AC TATATGCCCA GG AACTGTAC 7XTTGGTTTT TAGTATCTCA 960 
TTATGGATAA TTGCCGCATG GACTGTCCGA GCTTGTGAAA GGTACCATGA TCAACAGG AT 1020 
GTTACTAGCA ACTTCCTTGG AGCGATGTGG TTGATATCAA TAACTTTTCT CTCCATTGGT 1080 
TATGGTGACA TGGTACCTAA CACATACTGT GG AAAAGGAG TCTGCTT ACT TACTGG AATT 1140 
ATGGGTGCTG GTTGCACAGC CCTGGTGGTA GCTGTAGTGG CAAGGAAGCT AGAACTTACC 1200 
AAAGCAGAAA AACACGTGCA CAATTTCATG ATGGATACTC AGCTGACTAA AAGAGTAAAA 1260 
AATGCAGCTG CCAATGTACT CAGGGAAACA TGGCTAATTT ACAAAAATAC AAAGCTAOTG 1320 
AAAAAGATAG ATCATGCAAA AGTAAGAAAA CATCAACGAA AATTCCTGCA AGCTATTCAT 1380 
CAATTAAGAA GTGTAAAAAT GGAGCAGAGG AAACTGAATO ACCAAGCAAA CACTTTGGTG 1440 
GACTTGGCAA AGACCCAGAA CATCATGTAT GATATG ATTT CTGACTTAAA CGAAAGGAGT 1500 
GAAGACTTCG AGAAG AGGAT TGTTACCCTG GAAACAAAAC TAGAGACTTT GATTGGTAGC 1560 
ATCCACGCCC TCCCTGGGCT CATAAGCCAG ACCATCAGGC AGCAGCAGAG AGATTTCATT 1620 
GAGGCTCAG A TGGAG AGCTA CG ACAAGCAC GTCACTTACA ATGCTGAGCG GTCCCGGTCC 1680 
TCGTCCAGGA GGCGGCGGTC CTCTTCCACA GCACCACCAA CTTCATCAGA GAGTAGCJAJ2 



SEQ ID NO:142 PFH1 Protein sequence:
Protein Accession #: NP_067627

1 11 21 31 41 51
| | | | | |

MSSCRYNGGV MRPLSNLSAS RRNLHEMDSE AQPLQPPAS V GGGGGASSPS AAAAAAAAVS 60 
SSAPETWSK PEHNNSNNLA LYGTGGGGST GGGGGGGGSG HGSS5GTKSS KKKNQNIGYK 120 
LGHRRALFEK RKRLSDYAU PGMPGIWMV IETELSWOAY DKASLYSLAL KCUSLSTU 180 
LLGLUVYHA REIQLFMVDN G ADD WR1AMT YERIFFICLE ILVCAIHPIP GNYTFTWTAR 240 
LAFS YAPSTT TADVDIILSI PMFLRLYLIA RVMLLHSKLF TDASSRSIGA LNKINFNTRF 300 
VMKTLMTICP GTVLLVFSIS LWJJAAWTVR ACERYHDQQD VTSNFLGAMW LISITFLSIG 360 
YGDMVPNTYC GKGVCLLTGI MGAGCTALV V AVVARKLELT KAEKHVHNFM MDTQLTKRVK 420 
NAAANVLRET WUYKNTKLV KKIDHAKVRK HQRKFLQAIH QLRS VKMEQR KLNDQANTLV 480 
DLAKTQNIMY DM1SDLNERS EDFEKRIVTL ETKLETLIGS IHALPGUSQ TIRQQQRDH 540 
EAQMESYDKH VTYNAERSRS SSRRRRSSST APPTSSESS 



SEQ ID NO:143 PFG9 DNA SEQUENCE
Nucleic Acid Accession #: AL110139, coding region is FGENESH predicted
Coding sequence: 1-1896 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 
MfiCGCGCCG TGCCGCTGCC CGCCCCGCTC CTGCCGCTCC TGCTGCTCGC GCTCCTGGCC 60
GCTXCCGCCG CCCGCGCCAQ CAGAGCCGAG TCCGTCTCCG CCCCGTGGCC CGAACCCGAG 120 
CGCGAGTCGC GGCCACCGCC CGGCCCGGGG CCCGGGAACA CCACCCGGTr TGGGTCTGGG 180 
GCGGCGGGCG GCAGCGGCAG CTCCAGCTCC AACAGCAGTG GCGACGCCTT GGTGACCCGC 240 
ATTTCCATCC TCCTCCGCGA CCTACCCACC CTCAAGGCAG CCGTGATCGT GGCGTTCGCC 300 
TTTACCACCC TCCTCATCGC CTGCCTGCTG CTGCGCGTCT TCAGGTCGGG AAAGAGGTTA 360 
AAGAAGACAC GCAAGTATGA TATCATCAOC ACTOCAGCAG AGCGAGTGGA AATGGCGCCA 420 
CTAAATGAAG AGGATGATGA AGATGAGGAC TCCACAGTAT TCGACATCAA ATACAGAGTG 480 
TCCTTGCCGG CTGCACTG AG ACGTCAGCTG CCAGGGTGCC AGACGCTACT GACAGTTOCT 540 
GTGOCCCCAC CCnCATCCT CG ACATTGAC CTTCCAGCAA GATGCAGTGG AAGGCCTGAT 600 
GGTGGAATCA GACCTGGTAA AACCTGTTTC CCAGCCTGGT GGCATCCTGT GGAAAGTTGG 660 
TCAGCTGCAA CCTGGGGTGT GAAGGACTGG ACCTGGAAGC CCTCTTGCGT CGGAGGTGTT 720 
GAAACCAAAA CG AACGTTAT GTATAAAACC CCAGCTCCAT CGTGCGTGTC AGGCATCTGC 780 
TCAGACTGTC ACTGGCAAGC TCGTTTCCAC OTCACCACAA TGGAGTTGCT TCTGCCACCC 840 
TTTGGGCATC CCTTTAAAGT GCCCCCTACT TCTACTCCCC ATGGTTTTCG ACAACTGCAG 900 
CTGAATCTCA TGGAAAAGCT GGATTCCTCT GCCTTACGCA G AAACACCCG GGCTCCATCT 960 
GCCAGGTGCT TGCCACTGGT CCTGGCAGAA ATGGCGGCTG CTGAAAGTGA CCTTCCAAAT 1020 
CCT7GGTGGC ACTTCAGCGC CACAGGCTCT OCAATAAAAA CCCTTTACAC ACAAACCATG 1080 
AGTACCTTGG GCTTGGATGT TTTCTGTGGT GCCGGCCAGC GGGGCACCTT TIGTG AAGAC 1140 
AGAGCAGTGA CTAAGGTTCT CCAGGGTAGC TC1TTCTOCA AACAGCTGCG CTGGAAGCCA 1200 
GCCCTAG AGA GTGGGTTTCC CCATCATCTC AGGCTTCTCA GAGAGTGTCC TOOGCTGAGC 1260 
ACCCATCCTG TCAGGTTGGC TCGTTCAGAT GCCCGGGGAC AAGCCAGCCT GACGGGGAGG 1320 
AGGGTGTTTC GGCGTCCGCG GCAGTCTCTG CATGGCGGAG GGTCAGCGGG TACCGCAACT 1380 
TGC C1 1 1 IUG TTTTGAAGAT TCTGTTGAGG CGCCATOCTC ACCTTGACCT CTTCTACAAA 1440 
ATO G TCTCC OCTGCTGTGC OGTGGAACAC CTACGGGAAG GCAAGAGAAO CTCAGTGACT 1500 
GTCCTTGCGT CATTTGAGCA GAGCCCACAA AAGGCAGCTG CTGCCCACGG GGAGCCTGTC 1560 
AAACGAGGGC CCAGTGGGCA ATTGACCAG A CACACATGCC CTGGCTGGGG GATCACACAT 1620 
GCGAAOCTGC AGACAATTOC AGATACCCAA GGCCAGGAAG GCCCACGTGA GG ATGTCACT 1680 
CACCCTGGAG GAGACTTGGA TGGGGTGGCA AATTTCTATT TGGAGGAAGA GGGTTTCCAG 1740 
GATGGCAGAT GCCAGAAGAT GGTCCTGATG TCTGAGGAAG GGCCACCTAG TTTGACAGGA 1800 
TGTGAGAGGC TCACAGGTTC CCATCACITC TtXAGCCATT CCAAGTCTTG GTCCTTCCTT 1860 
TCCCCCCGAC AGCCCCTGTT TCTGTCCAGG CCCTGA 



SEQ ID NO:144 PFG9 Protein sequence 

Protein Accession #: none available, FGENESH predicted 

1 11 21 31 41 51 

MRAVPLPAPL LPLLLLALLA APAARASRAE SVSAPWPEPE RESRPPPGPG FGNTTRFGSG 60 
AAGGSGSSSS NSSGDALVTR ISILLRDLPT UCAAVIVAFA FTTLLIACLL LRVFRSGKRL 120 
KKTRKYDHT TPAERVEMAP LNEEDDEDED STVFDKYRV SLPAALRRQL PGCQTLLTVP 180 
VPPPFUDID LPARCSGRPD GGIRPGKTCFPAWWHPVESW SAA1WGVKDW TWKPSCVGGV 240 
BTKTNVMYKT PAPSCVSGIC SDCHWQARFH VTTMELLLPP FGHPFKVPPT STPHGFRQLQ 300 
LNLMEKLDSS ALRRNTRAPS ARCLPLVLAE MAAAESDLFN PWWHFS ATGS PKTLYTQTM 360 
STLGLDVPCG AGQRGTPCED RAVTK VLOGS SFSKQLRWKP ALESGFPHHL RLLRECPPLS 420 
THPVRLARSD ARGQASLTGR R VFRRPRQSL HGGGS AGTAT CLLVLKILLR RHPHLDLFYK 480 
ICLPCCAVEH LREAKRSSVT VLASFEQSPQ KAAAAHGEPV KRGPSGQLTR HTCPGWGTra 540 
ANLQTIPDTQ GQEGPREDVT HPGGDLDGVA NFYLEEEGFQ DGRCQKMVLM SEEGPPSLTG 600 
CERLTGSHHF SSHSKSWSFL SPRQPLFLSR P 



SEQ ID NO:145 PFG6 DNA SEQUENCE 

Nucleic Acid Accession #: NM_013427 

Coding sequence: 875-3799 (underlined sequences correspond to start and stop codons) 
1 11 21 31 41 51 

GGCTGGGCTG CGAATAGCGT GTTCCTCTCC GGCGGAACAC ACACACOCGG CCTTGGGGCT 60 
GTCTCCTGAA GCTCCCTCCT OCACGGAGAG CGCTGAGCGC CGCCGGG AAT TCCATCCCAC 120 
CGTGGGCACG CAGTCTTTGG AGGTCCCGGG CGCAGCACGC TCGGTGTCCC CACACTGCAG 180 
CAAGACAGAG ACCCCGCGGG AACCTTOAGC TTGGAACAAC CCTTGAGCCT CTGCAGTCGG 240 
AAGAGTGGGC GCAGCAGCCX: AGCGGAGGCC AGGCGCGCAA CCTCGGGCGC CGGGGCAAGG 300 
AGAGAGTGCA GGGAGGCGCA GCTCAGGCGC CCGGCTCAGG AGCGGGAGGA AGTTCTCGCG 360 
GCGCCGGGAG CGCXK3TGGAC GCGCCCTGGG CGCACGCCCA GGCAGCCTTC TCCCTGGCCC 420 
TOGGGACTGT CCTOGGGCGG CAAGGAGGAG CTTGCTGGAG TCTTAGAGGC CATCCAGAGC 480 
CAGCGAGCAG GAGCGCTGCG TCTCCCGCCT CAGCTAGG AA GGGGGAGTGG CGCTGGCAGG 540 
CTGGAGCTGG GAAGCCAGCG AGCGCCTGAC CTTCCTCCTC CTCTTCCTGA CCCTCTICGC 600 
GTCTTGGGCT CCGGAGGAAG GTTCTAGCGG CTGCAGGAGG TCCCCAGACC CATTTTCCTA 660 
GAAGGCTGGT G ATGGATCTG CTGCTCCTGC CGCCGCCGGG GCACTTGGAG CGCACCGGCG 720 
GCGCGTGAGC TGGGCTTTGC TCTCCACCGC CCTGGGCAAA CCCCGGGCCA GCCCCGCCTG 780 
GCACCTITGC CTGAGTCCCT TTCGGTTCCC GACCCAAAGC CACCAGCGTC CAGGGAGGGA 840 
GGAGGAGGTG GTCCTCAGGT GCAGCCCOGC CGAGATGTCC GCGCAGAGCC TGCTCCACAG 900 
CGTCTTCTCC TGTTOCTCGC C5CGCTTCAAG TAGCGCGGCXT TCGGCCAAGG GCTTCTCCAA 960 
GAGGAAGCTG CGCCAG ACCC GCAGCCTGG A OGCGGCCCTG ATCGGCGGCT GCGGG AGCG A 1020 
CGAGGOGGGC GCGGAGGGCA GTGCGCGGGG AGCCACGGCG GGCCGCCTCT ACTCCCCATC 1080 
ACTCCCAGCC GAGAGTCTCG GCCCTCGCTT GGCCTCCTCT TCCCGGGGTC CGCCCCCCAG 1140 
GGCCACCAGG CTACCGCCTC CTGGACCTCT TTGCTCGTCC TTCTCCACAC CCAGCACCCC 1200 
GCAGGAGAAG TCACCATCCG GCAGCTTTCA CTTTCACTAT GAGGTTCCCC TGGGTCGCGG 1260 
CGGCCTCAAG AAGAGCATGG CCIX3GGACCT GCCITCTGTC CTGGCCGGGC CAGCCAGTAG 1320 
CCGAAGCGCT TCCAGCATCC TCTGTTCATC CGGGGGAGGC CCCAATGGCA TCTTCGCTTC 1380 
TCCTAGGAGG TGGCTCCAGC AGAGG AAGTT CCAGTOCCCA CCCGACAGTC GCGGGCACCC 1440 
CTACGTCGTG TGO AAATCCG AGGGTGATTT CACCTGQAAC AGCATGTCAG GCCGCAGTGT 1500 
GCGGCTGAGG TCAGTCCOCA TCCAG AGTCT CTCAGAGCTG GAGAGGGCCC GGCTGCAGGA 1560 
AGTGCCTTTT TATCAGTTGC AACAGGACTG TGACCTGAGC TGTCAGATCA CCATTCCCAA 1620 
AGATGG AC AA AAG AGAAAGA AATCTTTAAG AAAGAAACTG GATTCACTAG G AAAGG AG AA 1680 
AAACAAAGAC AAAGAATTCA TCCCACAGGC ATTTGGAATG CCCTTATCCC AAGTCATTGC 1740 
GAATG ACAGG GCCTATAAAC TCAAGCAGGA CTTGCAG AGG G ACG AGCAG A AAGATGCATC 1800 
TGACTTTGTG GCTTCCCTCC TCCCATTTGG AAATAAAAGA CAAAACAAAG AACTCTCAAG 1860 
CAGTAACTCA TCTCTCAGCT CAACCTCAGA AACACCGAAT GAGTCAACGT CCCCAAACAC 1920 
CCCGGAACCG GCTCCTCGGG CTAGGAGGAG GGGTGCCATG TCAGTGGATT CTATCACCGA 1980 
TCTTGATGAC AATCAGTCTC GACTACTAGA AGCTTTACAA CTITCCTTGC CTGCTGAGGC 2040 
TCAAAGTAAA AAGG AAAAAG OCAGAG ATAA GAAACTCAGT CTGAATCCTA TTTACAG ACA 2100 
GGTCCCTAGG CTGGTGGACA GCTGCTGTCA GCACCTAOAA AAACATGGCC TCCAGACAGT 2160 
GGGGATATTC C GAGTTGGA A GCTCAAAAAA O AGA GTGAGA CAATTACGTG AGGAATTTGA 2220 
CCGTGGGATT GATGTCTCTC TGGAGGAGGA GCACAGTGTT CATG ATGTGG CAGCCTTGCT 2280 
GAAAGAGTFC CTG AGGG ACA TGCCAG ACCC CCTTCTCACC AGGG AGCTGT ACACAGCTTT 2340 
CATCAACACT CTCTTGTTGG AGCCGG AGGA ACAGCTGGGC ACCTTGCAGC TCCTCATATA 2400 
CCTXCTACCT CCCTGCAACT GCGACACCCT CCACCGCCTG CTACAGTTCC TCTCCATCGT 2460 
GGCCAGGCAT GCCGATGACA ACATCAGCAA AGATGGGCAA GAGGTCACTG GGAATAAAAT 2520 
GACATCTCTA AACTTAGCCA CCATATTTGG ACCCAACCTG CTGCACAAGC AGAAGTCATC 2580 
AGACAAAGAATTCTCAGTrc AGAGTTCAGC CCGGGCTGAG GAGAGCACGG CCATCATCGC 2640 
TGTTGTGCAA AAGATGATTG AAAATTATGA AGCCCTGTTC ATGGTTCCCC CAGATCTGCA 2700 
GAACGAAGTG CTGATCAGCC TGTTAGAGAC CGATCCTGAT GTCGTGG ACT ATTTACTCAG 2760 
AAGAAAGGCT TCCCAATCAT CAAGCCCTG A CATGCTGCAG TCGG AAGT7T CCTTTTCCGT 2820 
GGG AGGGAGG CATTCATCTA CAGACTCCAA CAAGGCCTCC AGCGGAGACA TCTCCCCTTA 2880 
TGACAACAAC TOCCCAGTGC TOTCTGAGCG CTCCCTGCTG GCTATGCAAG AGGACGCGGC 2940 
CCCGGGGGGC TCGGAGAAGC TTTACAGAGT GCCAGGGCAG TTTATGCTGG TGGGCCACTT 3000 
GTCGTCGTCA AAGTCAAGGG AAAGTTCTCC TGG ACCAAGG CTTGGGAAAG ATCTGTCAGA 3060 
GGAGCCTTTC G ATATCTGGG GAACTTGGCA TTCAACATTA AAAAGCGGAT CCAAAG ACCC 3120 
AGG AATGACA GGTTCCTCTG GAGACATTTT TGAAAGCAGC TCCCTAAGAG CGGGGCCCTG 3180 
CTCCCTTTCT CAAGGG AACC TOTCCCCAAA TTGGCCTCGG TGGCAGGGGA GCCCCGCAGA 3240 
GCTGG ACAGC G ACACGCAGG GGGCTCGGAG GACTCAGGCC GCAGCCCCCG CGACGG AGGG 3300 
CAGGGCCCAC CCTGCGGTGT CGCGCGCCTG CAGCACGCCC CACGTCCAGG TGGCAGGG AA 3360 
AGCCGAGCGG CCCACGGCCA GGTDGGAGCA GTACTTGACC CTG AGCGGCG CCCACG ACCT 3420 
CAGCGAGAGT GAGCTGGATG TGGCCGGGCT GCAGAGCCGG GCCACACCTC AGTGCCAAAG 3480 
ACCCCATGGG AGTGGGAGGG ATGACAAGCG GCCCCCGCCT CCATACCCGG GCCCAGGGAA 3540 
GCCCGCGGCA GCGGCAGCCT GGATCCAGGG GCCCCCGGAA GGCGTGGAGA CACCCACGGA 3600 
CCAGGGAGGC CAAGCAGCCG AGCGAGAGCA GCAGGTCACG CAGAAAAAAC TGAGCAGCGC 3660 
CAACTCCCTG CCAGCGGGCG AGCAGGACAG TCCGCGCCTG GGGGACGCTG GCTGGCTCGA 3720 
CTGGCAGAGA GAGCGCTGGC AGATCTGGGA GCTCCTGTCG ACCGACAACC CCGATGCCCT 3780 
GCCCGAGACG CTGGTCTGAG CCCGCACCCA GCCGAGCCCC CCCTGCCCCG AGCCCCCCGC 3840 
CCTCCAGCCC AGGGGGGACC GTGGGTGGTG GCCACTGGCA CACTTAGTGT TCTTCTTTCA 3900 
CACTTCTCAA AAGTGACACA AGAGAAATCC AGTTCAOCTA CAGAGGTAGA GCACTCACGC 3960 
COCCGCCATT GAGAATAAGG TTCCATTGCG TAGCCAGCCT TAGG AAAAAC AAACAG AACC 4020 
CAAACCAGAT GGCAATGTCC AATCTAAAAA CXTrCCCTCTT GGCTCTATAA TATAAGATAC 4080 
AACTCTTGCT TGGTATAGCC TAACCGTATT TATGTGTCTT CGGTTTTGAC TATTGTGTAT 4140 
TCTGTAACAG ATT ATGTATA ATCATATATG ATATATTCAC AAAGAGAAAA CAAAAGGAAC 4200 
TTTTAAAAAA AAAATCACTT CACTTATATT AAGCAATGAG ATATACTAAA CAATGAGATT 4260 
CTATAGAATG TTCTAGAATG TGCACAAGCG GGTTTCTGTG CTTTTGCCAT AGCTTTATAA 4320 
CTGGGGATAA CECTTCCTTC GATACCAAAC ACTAACAAGA GGAAGCAGAA TATGAG AAGC 4380 
CATATTTTTA CATAGGAGTC AGATACAAAA AGAAAAATCA CTGAATGCTT TTAGATATTG 4440 
AATACGTTTT CAGGAAAATG CTAAATCTGA TAGATTACGA AATATATTTT TAGAACTTGT 4500 
TTAGAAAGGA TTCAGTTAAC CAAACAAGAA AAAGGCAGTG CCTCACAAAG AAATTAAGAA 4560 
GTTGTCCGTC CCACCTTACA TCAAATTCAG TTTTATATAG GCCATATATA ATATATATTT 4620 
ATAATGTATA ATTTTTATGT ATTTTTCAAA ACTACAAACT GGAATCCAAC TATAAAGTGT 4680 
TTAAGAATCT ACACAG AATA TTCAAATTAT AGAACATGTT TTTTCCCTTT GCCCCATAAT 4740 
CAOTATTTGC CAAATTACAT GCAATTCCTT AAAAACTAAA TCACATTGGT AAAAGO CCTA 4800 
CAGCTTTGTA CTTACATTGT GCCAAAGGCT GAGGAAATGT TTTCTTTCGA ATTTTTATGT 4860 
GTATTGTAAA ATGTTCTACC GTACTTTAGT AGTTTG AAGT TTTCAAGTGC ATAACTATTT 4920 
TTGACCAGCA GAAGGCGATA CGCTTCAGTA TTTTATGCAA TTTTTTTTCA CTTCG AAGGG 4980 
AAAGTGTATT ATAAAAAAAG ATTTTTTTTT TTTAAAACAT GCTACTCTTA ATTTTCATGT 5040 
TGGTG ATGAA ATTCCCAGTG GTGTTTCTTA AGGTTCTATC TTGTGCCATG ATGAATAAAA 5100 
AGTTAAGCAA AAAAAAAAAA AAAAAAAAAA AAA 



SEQ ID NO:146 PFG6 Protein sequence 
Protein Accession #: NP_038286.1 

1 11 21 31 41 51 

MSAQSUHS V FSCSSPASSS AASAKGFSKR KLRQTRSLDP ALIGGCGSDE AGAEGSARGA 60 
TAGRLYSPSL PAESLGPRLA SSSRGPPPRA TRLPPPGPLC SSESTPSTPQ EKSPSGSFHF 120 
DYEVPLGRGG LXKSMAWDLP SVLAGPASSR SASSILCSSG GGPNGIFASP RRWLQQRKPQ 180 
SPPDSRGHPY WWKSEGDFT WNSMSGRSVR LRSVPIQSLS ELERARLQBV PFYQLQQDCD 240 
LSCQITffKD GQKRKKSLRK KLDSU3KEKN KDKEFIPQAF GMPLSQVIAN DRAYKLKQDL 300 
QRDEQKDASD FVASLLPFGN KRQNKELSSS NSSLSSTSET PNESTSPNTP EPAPRARRRG 360 
AMSVDSITDL DDNQSRLLEA LQLSLPAEAQ SKKEKARDKK LSLNPIYRQV PRLVDSCCQH 420 
LEKHGLQTVG IFRVGSSKKR VRQLREEFDR GIDVSLEEEH SVHDVAALLK EFLRDMPDPL 480 
LTRELYTAFI NTLLLEPEEQ LGTLQLUYL LPPCNCDTLH RLLQFLSIVA RHADDNISKD 540 
GQEVTGNKMT SLNTATIFGP NLLHKQKSSD KEFSVQSSAR AEESTAUAV VQKMDENYEA 600 
LFMVPPDLQN EVLLSLLETD PDWDYLLRR KASQSSSPDM LQSEVSFSVG GRHSSTDSNK 660 
ASSGDKPYD NNSPVLSERS LLAMQEDAAP GGSEKLYRVP GQFMLVGHLS SSKSRESSPG 720 
PRLGKDLSEB PFDIWGTWHS TLKSGSKDPG MTGSSGDffE SSSLRAGPCS LSQGNLSPNW 780 
PRWQGSPAEL DSDTQGARRT QAAAPATEGR AHPAVSR ACS TPHVQVAGKA ERPTARSEQV 840 
LTLSGAHDLS E5ELDVAGLQ SRATPQCQRP HGSGRDDKRP PPPYPGPGKP AAAAAWIQGP 900 
PEGVETPTDQ GGQAAEREQQ VTQKKLSS AN SLPAGEQDSP RLGDAGWLDW QRERWQIWEL 960 
LSTDNPD ALP ETLV 



SEQ ID NO:147 PFG4 DNA SEQUENCE 

Nucleic Acid Accession #: NM_002202 

Coding sequence: 240-1289 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 
| | | | | | 

CCCCCGAGCC GCGCCGAGTC TGCCGCCGCC GCAGCGCCTC CGCTCCGCCA ACTCCGCCGG 60 
CTTAAATTGG ACTGCTAGAT CCGCGAGGGC GCGGCGCAGC CG AGCAGCGG CTCTTTCAGC 120 
ATTGGCAACC CCAGGGGCCA ATATTTCCCA CTTAGCCACA GCTCCAGCAT CCTCTCTGTG 180 
GGCTGTTCAC CAACTGTACA ACCACCATTT CACTGTGGAC ATTACTCCCT CTTACAGATA 240 
TGGGAGACAT GGGAGATCCA CCAAAAAAAA AACGTCTGAT TTCCCTATGT GTTGGTTGCG 300 
GCA ATCAG AT TCACGATCAG TATATTCTGA GGGTTTCTCC GGATTTGGAA TGGCATGCGG 360 
CATGTTTGAA ATGTGCGGAG TGTAATCAGT ATTTGGACGA GAGCTGTACA TGCTTTGTTA 420 
GGGATGGGAA AACCTACTGT AAAAGAGATT ATATCAGGTT GTACGGGATC AAATGCGCCA 480 
AGTG CAGCAT CGGCTTCAGC AAGAACG ACT TCGTGATGCG TGCCCGCTCC AAGGTGTATC 540 
ACATCG AGTG TTICCGCTGT GTGGCCTGCA GCCGCCAGCT CATCCCTGGG G ACGAATTTG 600 
CGCTTCGGGA GG ACGGTCTC TTCTGCCGAG CAG ACCACGA TGTGGTGG AG AGGGCCAGTC 660 
TAGGCGCTGG CGACCCGCTC AGTCCCCTGC ATCCAGCGCG GCCACTGCAA ATGGCAGCGG 720 
AGCCCATCTC CGCCAGGCAG CCAGCCCTGC GGCCCCACGT CCACAAGCAG CCGGAGAAGA 780 
CCACCCGCGT GCGG ACTGTG CTGAACGAGA AGCAGCTGCA CACCTTGCGG ACCTGCTACG 840 
CCGCAAACCC GCGGCCAGAT GCGCTCATGA AGGAGCAACT GGTAGAGATG ACGGGCCTCA 900 
GTCCCCGTGT GATCCGGGTC TGGTTTCAAA ACAAGCGGTG CAAGGACAAG AAGCG AAGCA 960 
TCATGATGAA GCAACTCCAG CAGCAGCAGC CCAATGACAA AACTAATATC CAGGGGATGA 1020 
CAGG AACTCC CATGGTGGCT GCCAGTCCAG AG AG ACACG A GGGTGGCTTA CAGGCTAACC 1080 
CAGTGGAAGT ACAAAGTTAC CAGCCACCTT GGAAAGTACT G AGCG ACTTC GCCTTGCAGA 1140 
GTGACATAGA TCAGCCTGCT TTTCAGCAAC TGGTCAATTT TTCAGAAGGA GGACCGGGCT 1200 
CTAATTCCAC TGGCAGTGAA GTAGCATCAA TGTCCTCTCA ACTTCCAGAT ACACCTAACA 1260 
GCATGGTAGC CAGTCCTATT GAGGCATGAG GAACATTCAT TCTGTATTTT TTTTCCCTGT 1320 
TGGAGAAAGT GGGAAATTAT AATGTCGAAC TCTGAAACAA AAGTATTTAA COACCCAGTC 1380 
AATGAAAACT GAATCAAGAA ATGAATGCTC CATGAAATGC ACGAAGTCTG TTTTAATGAC 1440 
AAGGTOATAT GGTAGCAACA CTGTGAAGAC AATCATGGGA TTTTACTAGA ATTAAACAAC 1500 
AAACAAAACG CAAAACCCAG TATATGCTAT TCAATGATCT TAG AAGTACT GAAAAAAAAA 1560 
GACGTTTTTA AAACGTAG AG GATTTATATT CAAGGATCTC AAAG AAAGCA TTTTCATTTC 1620 
ACTGCACATC TAGAGAAAAA CAAAAATAGA AAATTTTCTA GTCCATCCTA ATCTGAATGG 1680 
TCCTGTTTCT AT ATTGGTCA TTGCCTTGCC AAACAGG AGC TCCAGCAAAA GCGCAGGAAG 1740 
AG AG ACTGGC CTCCTTGGCT GAAAGAGTCC TTTCAGG AAG GTGG AGCTGC ATTGGTTTGA 1800 
TATGTTTAAA GT TGACTTT A ACAAGGGGTT AATTGAAATC CTGGGTCTCTTG GCCTG TCC 1860 
TGTAGCTGGT TTATTTTTTA CTTTGCCCCC TCCCCACTTT TTTTGAGATC CATCCTTTAT 1920 
CAAGAAGTCT GAAGCGACTA TAAAGGTTTT TGAATTCAGA TTTAAAAACC AACT TATAAA 1980 
GCATTGCAAC AAGGTTACCT CTATTTTGCC ACAAGCGTCT CGGGATTGTG TTTGACTTGT 2040 
GTCTGTCCAA GAACTTTTCC CCCAAAGATG TGTATAGTTA TTGGTTAAAA TGACTGTTTT 2100 
CTCTCTCTAT GGAAATAAAA AGGAAAAAAA AAAGGAAACT TTTTTTGTTT GCTCTTCCAT 2160 
TGCAAAAATT ATAAAGTAAT TTATTATTTA TTGTCGGAAG ACTTGCCACT TTTCATGTCA 2220 
TTTG ACATTT TTTGTTTGCT GAAGTGAAAA AAAAAGATAA AGGTTGTACG GTGGTCTTTG 2280 
AATTATATGT CTAATTCTAT GTGTTTTGTC TTTTTCTTAA ATATTATGTG AAATCAAAGC 2340 
GCCATATGTA GAATTATATC TTCAGGACTA TTTCACTAAT AAACATTTGG CATAG AT 



SEQ ID NO:148 PFG4 Protein sequence 
Protein Accession #: NP_002193.1 

1 11 21 31 41 51 
| | | | | | 

MGDPPKKKRL ISLCVGCGNQ 1HDQYILRVS PDLEWHAACL KCAECNQYLD ESCTCFVRDG 60 
KTYCKRDym LYGIKCAKCS IGFSKNDFVM RARSKVYHE CFRCVACSRQ UPGDEFALR 120 
EDGLFCRADH DWERASLGA GDPLSPLHPA RPLQMAAEPI SARQPALRPH VHKQPEKTTR 180 
VRTVLNEKQL HTLRTCYAAN PRPDALMKEQ LVEMTGLSPR VKVWFQNKR CKDKKRSIMM 240 
KQLQQQQPND KTNIQGMTGT PMVAASPERH DGGLQANPVE VQSYQPPWKV LSDFALQSDI 300 
DQPAPQQLVN FSEGGPGSNS TGSEVASMSS QLPDTPNSMV ASPIEA 

SEQ ID NO:149 PFG2 DNA SEQUENCE 

Nucleic Acid Accession #: NM_001172 

Coding sequence: 39-1103 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 
| | | | | | 

GCGGAGCTCT GCCTTGGAGA TTCTCAGTGC TGCGGATCAT GTCCCTAAGG GGCAGCCTCT 60 
CGCGTCTOCT OCAG ACGCGA GTGCATTCCA TCCTGAAO AA ATOCGTCCAC TCOGTGGCTG 120 
TGATAGG AGC CCCGTTCTCA CAAGGGCAGA AAAGAAAAGO AGTGGAGCAT GOTCOCGCTG 180 

CCATAAGAGA AGCTGGCTIX} ATGAAAAGGC TCTCCAOTTT GGGCTGCCAC CTAAAAGACT 240 
TTGGAGATTT GAGTTTTACT CCAGTCCCCA AAGATGATCT CTACAACAAC CTGATAGTGA 300 
ATCCAOGCTC AGTGGGTCTT GCCAACCAGG AACTGGCTGA GGTGGTTAGC AG AGCTGTGT 360 
GAGATGGCTA CAGCTGTGTC ACACTGGGAG GAGACCACAG CCTGGCAATC GGTACCATTA 420 
GTGGCCATGC CCGACACTGC CCAGACCTTT GTG TIG TCTG GGTTGATGCC CATGCTOACA 480 

TCAACACACC CCTTACCACT TCATCAGGAA ATCTCCATGG ACAGCCAGTT TCATTTCTCC 540 
TCAGAGAACT ACAGG ATAAG GTACCAC AAC TCCCAGGATT TTCCTGGATC AAACCTTGTA 600 
TCTCTTCTGC AAGTATTGTG TATATTGGTC TG AG AGACGT GGAOCCTOCT GAACATTTTA 660 
TTTTAAAGAA CTATGATATC CAGTATTTTT CCATGAGAGA TATTGATCGA CTTGGTATCC 720 
AGAAGGTCAT GGAAOGAACA TTTGATCTGC TGATTGGCAA GAGACAAAG A CCAATOCATT 780 

TGAGTTTTGA TATTGATGCA TTTGACCCTA CACTGGCTCC AGGCACAGGA ACTCCTGTTG 840 
TCGGGGGACT AACCTATCGA G AAGGCATGT ATATTGCTGA GGAAATACAC AATACAGGGT 900 
TGCTATCAGC ACTGG ATCTT GTTGAAGTCA ATCCTCAGTT GGOCACCTCA GAGGAAGAGG 960 
CGAAGACTAC AGCTAACCTG GCAGTAGATG TGATTGCTTC AAGCTTTGGT CAGACAAGAG 1020 
AAGGAGGCCA TATTGTCTAT GACCAACTTC CTACTCCCAG TTCACCAGAT GAATCAGAAA 1080 

ATCAAGCACG TGTGAGAATT TAGGAGACAC TGTGCACTGA CATGTTTCAC AACAGGCATT 1140 
OCAGAATTAT G AGGCATTGA GGGGATAGAT G AATACTAAA TGGTTGTCTG GGTCAATACT 1200 
GCCTTAATGA GAACATTTAC ACATTCTCAC AATTGTAAAG TTTCGCCICT ATTTTGGTGA 1260 
CCAATACTAC TGTAAATGTA TTTGGTTTTT TGCAGTTCAC AGGGTATTAA TATGCTACAG 1320 

TACTATGTAA ATTTAAAGAA GTCATAAACA GCATTTATTA CCTTGGTATA TCATACTGGT 1380 

CTTGTTGCTG TTGTTCCTTC ACATTTAAGT GGTTTTTCAT CTTTCCTCCC TCCTCCCACA 1440 

GCCTGGCTAT ACAGTGCATC CTTGAACTGT CAGOCCACAG CAGCAATATG CTTATTCTAT 1500 
OCACATGCCT AACATCATGC ATTCACAAGG TCAAAGTTCT GGTCCACAAA OCCTTCOCTA 1560 
TAGAAGTTCA ATGGCTGCGA AAGAATTTGT AGTAAACCAG GCCTCCCAGG ATGGCGAGCT 1620 

CCAGTAAGAT GATAATGGAA AGCAGCAGCT TGTTGGTTGT CACTCTACAA AGAGAAGCAA 1680 

AGTGGGGAGT AGTCAGAAGT TTGGATAACC TTCCTTCTAA ACATTTGGGG GTTAGACCTG 1740 
GGAGCACGGC TGGATACTCT GAGGCTGTAT GTTTGATCAC ACAGCCACTT AGCAGGAAGT 1800 
ACTCATAAGG TTCTTTAGCT GTCACTTAGG GATAACACTG TCTACCTCAC AGAAATGTTA 1860 
AACTGAGACA ATAAAACCCA AAGCAT 


SEQ ID NO:150 PFG2 Protein sequence 
Protein Accession #: NP_001163.1 

1 11 21 31 41 51 
| | | | | | 

MSLRG5LSRL LQTR VHSHJC KSVHSVA VIG APFSQGQKRK GVEHGPAAIR EAGLMKRLSS 60 
LGCHLKDPGD LSFTPVPKDD LYNNLTVNPR SVG1ANQELA EWSRAVSDG YSCVTLGGDH 120 
SIAIGTISGH ARHCPDLCW WVDAHADINT PLTTSSGNLH GQPVSHXRE LQDKVPQLPG 180 
FSWIKPCISS ASIVY1GLRD VDPPEHFILK NYDIQYFSMR DIDRLGIQKV MERTFDLUG 240 
KRQRPIHLSFDIDAFDPTLA PATGTPWGG LTYREGMYIA EHHNTGLLS ALDLVEVNPQ 300 
LATSEEEAKT TANLA VD VIA SSFGQTREGG HTVYDQLPTP SSPDESENQA RVRI 



SEQ ID NO:151 PFG1 DNA SEQUENCE 

Nucleic Acid Accession #: NM_017906 

Coding sequence: 80-1255 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 
| | | | | | 

AATTATATAT TTTTACTCTA 1 <j 1 1 1U1CTA CA 1UU 1 ll ' l ' 1C1 1 1LC GTT GCTGGCGGAA 60 
GAGGCACGTG CGCTGCTG AA TG GAGCTGGT CGCTGGTTGC TACGAGCAGG TCCTCTTTGG 120 
GTTCGCTGTA CACCCGGAGC OCAAGGCTTG CGGCGACCAC GAGCAATGGA CTCTTGTGGC 180 
TGACTTCACT CACCATGCTC ACACTGCCTC CTTGTCAGCA GTAGCTGTAA ATAGTCGTTT 240 

TGTGGTCACT GGGAGCAAAG ATGAAACAAT TCACATTTAT GACATGAAAA AGAAGATTGA 300 
GCATGGGCCT CTAGTGCATC ACAGTGGT AC AATAACTTGC CTGAAATTCT ATGGCAACAG 360 
GCATTTAATC AGTGGAGCGG AAGATGGACT CATCTGTATC TGGGATGCAA AGAAATGGGA 420 
ATGCCTGAAG TCAATTAAAG CTCACAAAGG ACAGGTGAOC TTOCTTTCTA TTCACCCATC 480 

TGGCAAGTTG GCCCTGTCGG TTGGTACAGA TAAAACTTTA AGAACGTGGA ATCTTGTAGA 540 

AGGAAGATCA GCATTCATAA AAAATATAAA ACAAAATGCT CACATAGTAG AATGGTCOCC 600 
AAGAGGAG AG CAGTATGTAG TTATCATACA GAATAAAATA GACATCTATC AGCTTGACAC 660 
TGC ATCCATT AGTGGCACCA TCACAAATG A AAAGAG AATT TOC TCTGTT A AATTTCTTTC 720 
AGAGTCTGTC CTTGCAGTGG CTGGAGATGA AGAAGTTATA AGGTTTTTTG ACTGTGATTC 780 
ACTAGTGTGC CTCTGCG AAT TTAAAGCTCA TGAAAACAGG GTAAAGGACA TGTTCAGTTT 840 

TGAAATTCCA GAGCATCATG TTATTGTTTC AGCATCGAGT GATGGTTTCA TCAAAATGTG 900 
GAAGCTTAAG CAGGATAAGA AAGTTCCCCC ATCTTTACTC TGTGAAATAA ACACTAATGC 960 
CAGGCTGACG TGTCTTGGAG TGTGGCTAGA CAAAGTGGCA GACATGAAAA GCCTTCCTCC 1020 
AGCTGCAGAG CCTTCTCCTG TAAGTAAAGA ACAGTCCAAA ATTGGCAAAA AGGAGCCTGG 1080 
TG ACACAGTG CACAAAG AAG AAAAGCGGTC AAAACCTAAC ACAAAG AAAC GOGGTTTAAC 1140 
AGGTGACAGT AAGAAAGCAA CAAAAGAAAG TGGCCTGATA TCAACCAAGA AGAGGAAAAT 1200 
GGTAGAAATG TTGGAAAAGA AGAGGAAAAA GAAGAAAATA AAAACAATGC AGTGAATCAC 1260 
AGATGTCTCC TGAAAGAACT CTTTTAGATG AAATCATTCT ACTCAAATGT ACCTTAATTT 1320 
TTTTTTTTCC CTGAGTAAAA GCAAGAAATT TCTTCCTTTG GAAAAAATAT ATATATTAAA 1380 
AAAOCACTTT TAGATGGTTT TTTTTAAAAA AAAAAAAAAA ACTGGTAAAA TTACTTTTGO 1440 
CAGACAGTGT TTTATGAATT ATGTATCATG TTGATATATA ATATGTTAAT GTGTCATGTA 1500 
ATTTTTACTT TGTAC AAAGC AAATAAAGAT CTTTCTCAAA AAAAAAAAAA AAAA 



SEQ ID NO:152 PFG1 Protein sequence 

Protein Accession #: NP_060376.1 
1 11 21 31 41 51 

MELVAGCYEQ VLFGFAVHPE PKACGDHEQW TLVADFTHHA HTASLSAVAV NSRFWTGSK 60 
DETIHIYDMK KKIEHG ALVH HSGTTTCLKF YGNRHLISGA EDGLICSWD A KKWECLKSIK 120 
AHKGQVTFLS IHPSGKLALS VGTDKTLRTW NLVEGRSAFI KN1KQNAHIV EWSPRGEQYV 180 
VIIQNKIDIY QLDTASISGT ITNEKRISSV KFLSESVLAV AGDEEVIRFF DCDSLVCLCE 240 
FKAHENRVKD MFSFEIPEHH VIVSASSDGF IKMWKLKQDK KVPPSLLCEI NTNARLTCLG 300 
VWLDKVADMK SLPPAAEPSP VSKEQSKIGK KEPGDTVHKE EKRSKPNTKK RGLTGDSKKA 360 
TKESGUSTK KRKMVEMLEK KRKKKKDCIM Q 



SEQ ID NO:153 PFD6 DNA SEQUENCE 

Nucleic Acid Accession #: NM_014668 

Coding sequence: 110-2953 (underlined sequences correspond to start and stop codons) 
1 11 21 31 41 51 

GATGTCT7GG ACATGCTCTG GCTGGCTAAT CTCCATGTTC TAGCGGACTG AAAATACGGT 60 
GGCCAAGTGG ATGGTGTGCT TATTTGCAGT CTAAAGAAAT TTCCTTTTGA TGTGGCAGAA 120 
AATCGAGGAT GTGGAGTGGA GACCCCAGAC TTACTTGGAG CTGGAGGGTC TGCCTTGCAT 180 
CCTCATCTTC AGTGGG ATGG ACCCGCATGG GGAGTCCTTG CCGAGGTCTT TG AGGTACTG 240 
TGACCTGCGA TTGATAAACT CCTCCTGCTT GGTGAGAACA GOCTTGGAGC AGGAGCTGGG 300 
CCTGGCTGCC TACTTTGTGA GCAACGAGGT TCCCTTGGAG AAGGGGGCTA GGAACGAGGC 360 
CTTGGAGAGT GATGCTG AGA AGCTGAGCAG CACAG ACAAC GAGGATGAGG AGCTGGGGAC 420 
AGAAGGCTCT ACCTCGGAGA AGAGAAGCCC CATGAAAAGG GAGAGGTCCC GCTCCCACGA 480 
CTCAGCATCC TCATCCCTCT CCTCCAAGGC TTCCGGTTCA GCGCTCGGTG GCGAGTCCTC 540 
GGCTCAGCCC ACAGCACTCC CCCAGGGAGA GCATGCCAGG TCGCCCCAGC CCCGTGGCCC 600 
CGCAGAGGAG GGCAGAGCCC CTGGTGAGAA ACAGAGGCCC GGGGCAAG7C AGGGGCCACC 660 
CTCGGCCATC AGCAGGCACA GTCCCGGGCC GACGCCCCAG CCCGACTGTA GCCTCAGG AC 720 
OGGCCAGAGG AGCGTOCAGG TGTCGGTCAC CTCGTCGTGC TCCCAGCTGT CCTCCTDCTC 780 
GGGCTCATCC TCCTCATCCG TGGCGCCCGC TGCCGGCACG TGGGTCCTGC AGGCCTCCCA 840 
GTGCTCCTTG ACCAAGGCCT GCCGCCAGCC ACCCATTGTC TTCTTGCCCA AGCTCGTGTA 900 
CGACATGGTT GTGTCCACTG ACAGCAGTGG CCTGCCCAAG GOOGCCTCCC TCCTGCCCTC 960 
CCCCTCGGTC ATGTGGGCCA GCTCTTTCCG CTCCCTGCTC AGCAAGACCA TGACATCCAC 1020 
CGAGCAGTCC CTCTACTACC GGCAGTGGAC GGTGCCCCGG CCCAGCCACA TGGACTACGG 1080 
CAACCGGGCC GAGGGCCGCG TGGACGGCTT CCACCCCCGC AGGCTGCTGC TCAGCGGCCC 1140 
CCCTCAGATC GGGAAGACAG GTGCCTACCT GCAGTTCCTC AGTGTCCTGT CCAGGATGCT 1200 
TGTTC GGCTC ACAGAAGTGG ATGT C T A TGA CGAGGAGGAG ATCAATATCA ACCTCAGAGA 1260 
AGAATCTGAC TGGCATTATC TCCAGCTTAG CGACCCCTGG CCAGACCTGG AGCTGTTCAA 1320 
GAAGTTGCCC TTTGACTACA TCATTCACGA CCCGAAGTAT GAAGATGCCA GCCTGATTTG 1380 
TTCGCACTAT CAGGCTATAA AG AGTG AAG A CAGAGGGATG TCCCGGAAGC CGGAGGACCT 1440 
TTATGTGCGG CGTCAGACGG CACGGATGAG ACTGTCCAAG TACGCAGCGT ACAACACTTA 1500 
CCACCACTGT GAGCAGTGCC ACCAGTACAT GGGCTTCCAC CCOOGCTACC AGCTGTATGA 1560 
GTCCACCCTG CACGCCTTTG CCTTCTCTTA CTCCATGCTA GGAGAGGAGA TCCAGCTGCA 1620 
CnCATCATC CCCAAGTCCA AGGAGCACCA C n rGTCTTC AGCCAACCTG GAGGCCAGCT 1680 
GGAGAGCATG CGACTACCOC TCGTGACAGA CAAGAGCCAT GAATATATAA AAAGTCCGAC 1740 
ATTCACTCCA ACCACCGGCC GTCACG AACA TGGGCTCTTT AATCTGTACC ACGCAATGGA 1800 
CGGTGCCAGC CATTTGCACG TGCTGGTTGT CAAGGAATAC GAGATGGCAA TTTATAAGAA 1860 
ATATTGGCOC AACCACATCA TGCTGGTGCT CCCCAGTATC TTCAACAGTG CTGGAGTTGG 1920 
TGCTGCTCAT TTCCTCATCA AGGAGCTGTC CTACCATAAC CTGGAGCTCG AGCGGAACOG 1980 
GCAGGAGG AG CTGGGAATCA AGCCGCAGGA CATCTGGCCT TTCATTGTGA TCTCTGATGA 2040 
CTCCTGCGTG ATGTGGAACG TGGTGGATGT CAACTCTGCT GGGGAGAGAA GCAGGGAGTT 2100 
CTCCTGGTCG GAAAGGAACG TGTCTTTGAA GCACATCATG CAGCACATCG AGGCGGCCCC 2160 
CGACATCATG CACTACGCCC TGCTGGGCCT GOGGAAGTGG TCCAGCAAGA CCCGGGCCAG 2220 
CGAGGTGCAA G AGCCCTTCT CCOGCTGCCA CGTGCACAAC TTCATCATCC TGAACGTGGA 2280 
CCTGACCCAG AACGTGCAGT ACAACCAGAA CCGGTTCCTG TGTGACGATG TAGACTTCAA 2340 
CCTCCGGGTG CACAGCGCCG GCCTCCTGCT CTGCCGGTTC AACCGCTTCA GCGTGATGAA 2400 
GAAGCAG ATC GTGGTGGGCG GCCACAGGTC CTTCCACATC ACATCCAAGG TGTCTG ATAA 2460 
CTCTGCCGCG GTCGTGCCGG CCCAGTACAT CTGTGCCCCG GACAGCAAGC ACACGTTCCT 2520 
CGCAGCGCCC GCCCAGCTCC TGCTGGAGAA GTTCCTGCAG CACCACAGCC ACCTCTTCTT 2580 
CCCGCTGTCC CTG AAG AACC ATG ACCACCC AGTGCTGTCT GTCG ACTGTT ACCTG AACCT 2640 
GGGATCTCAG ATTTCTGTTT GCTATGTG AG CTCCAGGCCC CACTCTTTAA ACATCAGCTG 2700 
CTCGGACTTG CTGTTCAGTG GGCTGCTGCT GTACCTCTGT GACTCTnTG TGGG AGCTAG 2760 
CTTTTTGAAA AAGTTTCATT TTCTGAAAGG TGCGACGTTG TGTGTCATCT GTCAGGACCG 2820 
GAGCTCACTG CGCCAG ACGG TCGTCCGCCT GGAGCTXGAG GACGAGTGGC AGTTCCGGCT 2880 
GCGCGATGAG TTCCAGACCG CCAATGCCAG GGAAGACCGG CCGCTCTTTT TTCTGACGGG 2940 
ACGACACATC TGAGGAAGAC AGCGGOGAGT TTTCTGAAGA GATGAGTGCT CAGAGCOCTC 3000 
ATGCTGTTG A GGCTAAAGGG AGGCCTGGAA CGGTGGGGCG TTTGACTGGA ATGGACCCCA 3060 
GGGACTGTCC AGGTGCAGCC CCTCCTAGTA CACATGGGCC CCCGAGGCCG TGGTCCTGGG 3120 
AGCCAGGAAG ACTCCGCAGT GGGTG AG AAT GAAAACTTGA GACTCCCAAG TTCTGGGCCA 3180 
GCOCATTGCT CTGGGCTGTT TTAAAGCCCA TTTCACGAGG AACAAAGATT TACTTCCTGT 3240 
CCTGCCATTC GTGTGCTTCC ATGGACAAAC CTGATTTTTT TCTCTTAGTT CTAAAGAATC 3300 
TTGGGTTATT TTGTAGCGGT GCCAGTATTT CAGTAGATGG GATTTCAGCC AAGTAGGTTC 3360 
CCCTGTAACC TCCTACAAAG CAATATTCCA AAGGAACATT TTAACTGTAA AGGCTGGAGA 3420 
CAAGAAAAAA TAAGTAGATC GTTTTAATAA CAATTATTTA ATTGCCTATA AGTTTGCTGT 3480 
TTCAGAGGCT AGCCCAAAGG CATCAAATTT AATAAAGTTA AACAAATTGA TTTACTTCAG 3540 
AGCAAATATG ATCCTATTAA AATAATATAG GGTAAATACC CTACCTCTTA G A AAGGGCAA 3600 
AAATGCAAAG AAGCTTTCTT TAAAACTAAA AGGGTTTTTT GGGGGGGGAG TTGGCGGGG A 3660 
GGAAATAAGG CTAACAGAGG TTGACCTAAA ATTAGCCTTA CAAAGGAGAA AGGACCACAT 3720 
TGCTTACTTG AAACAGACAA TGAAAACAAC CAAAGTGATA TATAAAATAG TTGATGAGAA 3780 
CTAGACTTAT GACTGTAGTT TACTAGAGTT TAGTTTTCAG TTGCTGAAGT AGCTCATTTT 3840 
CTCTTACTAA TGTTTGGTTC CTCAGGG AAG AATCTCACTT GACTAG AG AG GAGGTGGG AA 3900 
CAGAAGAGAO AAGGA GGCAG GGAG ATGTAT TTCTTAGGGC TCACCCCTTC ACAGACTGAC 3960 
AGAATGGTTT TGTTTTGTTT TGTTTTGTTT TGTTTTGTTT TTGAGATGGA CTCTAGCTCT 4020 
GTCACOCAGG CTGG AGTGCA GTGGTGCGAT CTCGGCTCAC TGCAAGCTCC GCCTCCCGGG 4080 
TTCTCACCAT TCTCCTGCCT CAGCCTCCCG AGTAGCTGGG ACTACAGGCG CCCACCACCA 4140 
CXJCCCGGCTA ATTTTTTGTA TTTTTTAGTA GAG ACGGGGT TTCACCATGT TAGCCAGGAT 4200 
GGTCTCGATC TCCTGACCTC GTGATCCGCC CGCCTCGGCC TCCCAAAGTG CTGGGATTAC 4260 
AGGOGTGAGC CAOCGTGCCT GCCCCAGAAT GGTTTTTAAA GCCACAGTTG AGAGGCCAOC 4320 
CATTGCCCGG CGCCTGGACA GTGATCATCT TGTTCATCTT GTTCAGTCCT TTCTTGTGTG 4380 
ATTGGAATTA TTCATCCCCT TTGAAAGATG AGAAGGTTGA G ATGCAAAGA GTCTACCTTT 4440 
CCAAGTTCTC ACTGCTGGAA AG AGCTAGAA GCACAGTTCA AAGTTCTGGC TTCTGGACTC 4500 
TGCAGTCCAG GTCTOCCTTC TCCCACTTGC CTACCCTCAA TGOCACACTG TTTTTGAAGT 4560 
GGCCCATAAC TTGAAGGAAA AGTTTAAAG A CAGTTCAATT TAATCATCAG AATGCATTCT 4620 
TTTTTTTTTC GGAGACGGAG TTTCACTCTT GCTGCCCAGG CTGGAGTGCA ATGGTGCAAT 4680 
GATCTCGGCT CACTGCAACC TCTGCCTCCT GGGTTCAAGT GATTCTOCAG CCTCAGOCTC 4740 
CCGAGTAGCT GGGATTATGG GCGCCCACCA CCATCCCCAG CTAATTTTTG TAJ 1 1 II 1 1 1 4800 
TTTTAGTAG A G ATGGGGTTT GGCCAGGTTG GCCAGGCTGG TCTTGTGAAC TCCTGGCCTC 4860 
AGGTGATCTG CCCACCTCAT CCTCCAAAAG TGCTGGGATT ACAGGCATGA GCCACTGCGC 4920 
CTGGCCTCAG AATGCATTCT TACACATCTA TCCTAGACAT TTATAAGCAC TCTAATGGAT 4980 
AACAATCCAA GAATAAATG A TTGTAAAAGA TGATGCCGAA GAGTTGATGT CAATCTTTTT 5040 
TTCCTAAGAA AAAAAGTCOG CGAGTATTAA ATATTTAGAT CAATGTTTAT AAAATGATTA 5100 
CTTTGTATAT CTCATTATTC CTATTTTGGA ATAAAAACTG ACCTTCTTTA ATCATATACT 5160 
TGTCTTTTGT AAATAGCAGC TTTIGTGTCA TTCTCCCCAC TTTATTAGTT AATTTAAATT 5220 
GGAAAAAACC CTCAAACTAA TATTCTTGTC TGTTCCAGTC TTATAAATAA AACTTATAAT 5280 
GCATG 



SEQ ID NO:154 PFD6 Protein sequence 
Protein Accession #: NP_055483.1 

1 11 21 31 41 51 
| | | | | | 

MWQKtEDVEW RPQTYLELEG LPQLZFSGM DFHGESLPRS LRYCDLRUN SSCLVRTALE 60 
QELGLAAYFV SNEVPLEKGA RNEALESDAE KLSSTDNEDE ELGTEGSTSE KRSPMKRERS 120 
RSHDSASSSL SSKASGSALG GESSAQPTAL PQGEHARSPQ PRGPAEEGRA FGEKQRFRAS 180 
QGPPSAISRH SPGPTPQPDC SLRTGQRSVQ VSVTSSCSQL SSSSGSSSSS VAPAAGTWVL 240 
QASOCSLTKA CRQPPiVFLP KLVYDMWST DSSGLPKAAS LLPSPSVMWA SSFRPLLSKT 300 
MTSTEQSLYY RQWTVPRPSH MDYGNRAEGR VDGEHPRRLL LSGPPQIGKT GAYLQFLSVL 360 
SRMLVRLTEV DVYDEEONI NLREESDWHY IjQLSDPWPDL ELFKKLPFDY UHDPKYEDA 420 
SLICSHYQGI KSEDRGMSRK PEDLYVRRQT ARMRLSKYAA YNTYHHCEQC HQYMGFHPRY 480 
QLYESTLHAF AF5YSMLGEE IQLHFUPKS KEHHFVFSQP GGQLESMRLP LVTDKSHEYI 540 
KSPTFTPTTG RHEHGLFNLY HAMDGASHLH VLWKEYEMA IYKKYWPNHI MLVLPSIFNS 600 
AGVGAAHFU KELSYHNLEL ERNRQEELGI KPQDIWPFIV ISDDSCVMWN WDVNSAGER 660 
SREFSWSERN VSLKHIMQHI EAAPDIMHYA LLGLRKWSSK TRASEVQEPF SRCHVHNFH 720 
LNVDLTQNVQ YNQNRFLCDD VDFNLRVHSA GLLLCRFNRF SVMKKQIWG GHRSFHTTSK 780 
VSDNSAAWP AQYICAPDSK HTFLAAPAQL LLEKFLQHHS HLFFPLSLKN HDHPVLSVDC 840 
YLNLGSQ1SV CYVSSRPHSL NISCSDLLFS GLLLYLCDSF VGASFLKKFH FLKGATLCVI 900 
CQDRSSLRQT WRLELEDEW QFRLRDEFQT ANAREDRPLF FLTGRHI 



SEQ ID NO:155 PFC6 DNA SEQUENCE 

Nucleic Acid Accession #: NM_000522 

Coding sequence: 1-1167 (underlined sequences correspond to start and stop codons) 
1 11 21 31 41 51 

AJGACAGOCT C£GTCk!tCCT CCACCCCCGC TGGATCGAGC CCACCGTCAT GTTTCTCTAC 60 
GACAACGGCG gcggcctcgt ggccgaogag ctcaacaaga acatggaagg GGCGGOGGCG 120 
GCTGCAGCAG CGGCTGCAGC GGCGGCGGCT GCCGGGGCCG GGGGCGGGGG CmXCCCAC 180 
CCGGCGGCTG CGGCGGCAGG GGGCAACTTC TOGGTGCCGG CCGCGGCCGC GGCTGCGGCG 240 
GCCGCCGCGG CCAACCAGTO CCGCAACCTG ATGGCGCACC CGGCGCCCTT GGCGCCAGGA 300 
GCCGCGTCCG CCTACAGCAG CGCCCCCGGG GAGGCGCCCC CGTCGGCTGC CGCCGCTGCT 360 
GCCGCGGCTG CCGCTGCAGC CGCCGCCGCC GCCGCCGCGT CGTCCTCGGG AGGTCCCGGC 420 
CCGGCGGGCC CGGCGGCGGC AGAGGCGGCC AAGCAATGCA GCCCCTGCTC GGCAGCGGCG 480 
CAGAGCTCOT CGGGGCCCGC GGCGCTGCCC TATGGCTACT TCGGCAGCGG CTACTACCCG 540 
TGCGCCCGCA TGGGCCCGCC CCCCAACGCC ATCAAGTCGT GCCCCCAGCC OCCCTCGGCC 600 
GCCGCCGCCG CCGCCTTCGC GGACAAGTAC ATGGATACCG CCGGCCCAGC TGGCGAGGAG 660 
TTCAGCTCCC GCGCTA AGGA GTTCGCGTTC TACCACCAGG GCTACGCAGC GGGGCCTTAC 720 
CACCACCATC AGCOCATGCC TGGCTACCTG GATATGCCAG TGGTGCCGGG CCTCGGGGGC 780 
CCCGGCGAGTCGCGCCACGA ACCCTTGGGT CTTCCCATGG AAAGCTACCA GCCCTGGGCG 840 
CTGCCCAACG GCTGG AACGG CCAAATGTAC TGCCCCAAAG AGCAGGCGCA GCCTCCCCAC 900 
CTCTGGAAGT CCACTCTGCC CGACGTGGTC TCCCATCCCT CGGATGCCAG CTCCTATAGG 960 
AGGGGGAGAA AGAAGCGCGT GCCTTATACC AAGGTGCAAT TAAAAGAACT TGAACGGGAA 1020 
TACGOCACG A ATAAATTCAT TACTAAGGAC AAACGGAGGC GGATATCAGC CACGACGAAT 1080 
CTCTCTGAGC GGCAGGTCAC AATCTGGTTC CAGAACAGGA GGGTTAAAGA GAAAAAAGTC 1140 
ATCAACAAAC TGAAAACCAC TAGTTAA 



SEQ ID NO:156 PFC6 Protein sequence 
Protein Accession #: NP_000513.1 

1 11 21 31 41 51 
| | | | | | 

MTAS VLLHPR W1EPTVMFLY DNGGGLVADE LNKNMEG AAA AAAAAAAAAA AGAGGGGFPH 60 
PAAAAAGGNF SVAAAAAAAA AAAANQCRNL MAHPAPLAPG AASAYSS APG EAPPS AAAAA 120 
AAAAAAAAAA AAASSSGGPG PAGPAAAEAA KQCSPCSAAA QSSSGPAALP YGYPGSGYYP 180 
CARMGPPPNA KSCPQPPSA AAAAAFADKY MDTAGPAAEE FSSRAKEFAF YHQGYAAGPY 240 
HHHQPMPGYL DMPWPGLGG PGESRHEPLG LPMESYQPWA LPNGWNGQMY CPKEQAQPPH 300 
LWKSTLPDW SHPSDASSYR RGRKKRVPYT KVQLKELERE YATNKFITKD KRRRISATTN 360 
LSERQVTTWF QNRR VKEKKV INKLKTTS 



SEQ ID NO:157 PFA3 DNA SEQUENCE 

Nucleic Acid Accession #: AW102723 

Coding sequence: 523-2676 (underlined sequences correspond to start and stop codons) 
1 11 21 31 41 51 

CCCTTATGGC GATTGGGCGG CFGCAGAGAC CAGGACTCAG TTCCCCTGCC CTAGTCTGAG 60 
CCTAGTGGGT GGGACTCAGC TCAGAGTCAG TTTTCAGAAG CAGGTTTCAG TTGCAGAGTT 120 
TTCCTACACT nTCCTGCGC TAGAGCAGCG AGCAGCCTGG AACAGACCCA GGCGGAGGAC 180 
ACCTGTGGGG GAGGGAGCGC CTGGAGGAGC TTAGAGACGC CAGCOGGGCG TGATCTCAOC 240 
ATGTGCGGAT TTGCGAGGCG CGCCCTGG AG CTGCTAGAGA TCCGGAAGCA CAGCCCCGAG 300 
GTGTGCG AAG OCAOCAAG AC TGCGGCTCTT GG AGAAAGOG TG AGCAGGGG GCCACCGCGG 360 
TCTCCGGCCT GTCTGCACCC TGTCGCCTGA GCTGCCTGAC AGTGACAATG ACATCCCAGT 420 
TACCAGTGTC CTTGAATTGA TAGTGGCTTC TGTTTGTCAG TCTCATATAA GAACTACAGC 480 
TCATCAGGAG GAGATCGCAG CAGGGTAAGA GACACCAACA CCATGTTCTG CACGAAGCTC 540 
AAGGATCTCA AGATCACAGG AG AGTGTCCT TICTCCTTAC TGGCACCAGG TCAAGTTCCT 600 
AACGAGTCTT CAGAGG AGGC AGCAGG AAGC TCAGAGAGCT GCAAAGCAAC CGTGCCCATC 660 
TGTCAAGACA TTCCTGAGAA GAACATACAA GAAAGTCTTC CTCAAAGAAA AACCAGTCGG 720 
AGCCGAGTCT ATCTTCACAC TTTGGCAGAG AGTATTTGCA AACTGATTTT CCCAGAGTTT 780 
GAACGGCTGA ATGTTGCACT TCAGAGAACA TTGGCAAAGC ACAAAATAAA AGAAAGCAGG 840 
AAATCTTTGG AAAG AG AAG A CTTTG AAAAA ACAATTGCAG AGCAAGCAGT GCAGCAGAGT 900 
CCAGTGGAGT TATCAAAGAA TCTCTTGGTG AAGAGGTTTT TAAAATATGT TACGAGGAAG 960 
ATGAAAACAT OCTTGGGGTG GTTGGAGGCA CCCTTAAAGA TTTTTAAACA GCTTCAGTAC 1020 
CCTTCTGAAA CAGAGCAGCC ATTGCCAAGA AGCAGG AAAA AGGGGCAGCT TGAGGACGCC 1080 
TCCATTCTAT GCCTGGATAA GGAGGATGAT TTTCTACATG TTTACTACTT CTTCCCTAAG 1140 
AG AACCACCT CCCTG ATTCT TCCCGGCATC ATAAAGGCAG CTGCTCACGT ATTATATG AA 1200 
ACGGAAGTGG AAGTGTCGTT AATGCCTCCC TGCTTCCATA ATGATTGCAG GGAU I'HUTG 1260 
AATCAGCCCT ACTTGTTGTA CTCCGTTCAC ATGAAAAGCA CCAAGCCATC CCTGTOCCCC 1320 
AGCAAACCCC AGTCCTCGCT GGTGATTCCC ACATCGCTAT TCTGCAAGAC ATTTCCATTC 1380 
CATTTCATGT TTG ACAAAGA TATGACAATT CTGCAATTTG GCAATGGCAT CAG AAGGCTG 1440 
ATGAACAGGA GAGACTTTCA AGGAAAGCCT AATTTTGAAT ACTTTGAAAT TCTOACTOCA 1500 
AAAATCAACC AGACCTTTAG CGGGATCATG ACTATGTTGA ATATGCAGTT TGTTGTACGA 1560 
GTGAGGAGAT GGGACAACTC TGTGAAGAAA TCI 1CAAGGG TTATGGACCT CAAAGGCCAA 1620 
ATGATCTACA TTGTTG AATC CAGTGCAATC TTGTTTTTGG GGTCACCCTG TGTGG ACAGA 1680 
TTAGAAGATT TTACAGG ACG AGGGCTCTAC CTCTCAGACA TCCCAATTCA CAATGCACTG 1740 
AGGGATGTGG TCTTAATAGO GGAACAAGCC CGAGCTCAAG ATGGCCTGAA GAAGAGGCTG 1800 
GGGAAGCXG A AGGCTAOOCT TGAGCAAGCC CACCAAGCCC TGGAGGAGGA GAAGAAAAAG 1860 
ACAGTAGACC TTCTGTGCTC CATATTTCCC TGTGAGGTTG CTCAGCAGCT GTGGCAAGGG 1920 
CAAGTTGTGC AAGCCAAGAA GTTCAGTAAT GTCACCATGC TCTTCTCAGA CATCGTTGGG 1980 
TTCACTGCCA TCTGCTCCCA GTGCTCACCG CTGCAGGTCA TCACCATGCT CAATGCACTG 2040 
TACACTCGCT TCGACGAGCA GTGTGGAGAG CTGGATGTCT ACAAGGTGGA GACCATTGCG 2100 
ATGCCTATTG TGTGGCTTGG GGGATTACAC AAAGAGAGTG ATACTCATGC TGTTCAGATA 2160 
GCGCTGATGG CCCTGAAGAT GATGGAGCTC TCTGATGAAG TTATGTCTCC CCATGGAGAA 2220 
CCTATCAAGA TGCGAATTGG ACTGCACTCT GGATCAGTTT TTGCTGGCGT CGTTGGAGTT 2280 
AAAATGCCCC GTTACTGTCT TTTTGGAAAC AATGTCACTC TGGCTAACAA ATTTGAGTCC 2340 
TGCAGTGTAC CACGAAAAAT CAATGTCAGC CCAACAACTT ACAGATTACT CAAAGACTGT 2400 
CCTGGTTTCG TGTTTACCCC TCGATCAAGG GAGGAACTTC CACCAAACTT CCCTAGTGAA 2460 
ATCCCCGGAA TCTGCCATTT TCTGGATGCT TACCAACAAG GAACAAACTC AAAACCATGC 2520 
TTCCAAAAGA AAGATGTGGA AGATGCAAGC CAATTTTTTA GGCAAAGCAT CAGGAATAGA 2580 
TTAGCAACCT ATATACCTAT TTATAAGTCT TTGGGGTTTG ACTCATTGAA GATGTGTAGA 2640 
GCCTCTGAAA GCACTTTAGG GATTGTAGAT GGCTAACAAG CAGTATTAAA ATTTCAGGAG 2700 
CCAAGTCACA ATCTTTCTCC TGTTTAACAT GACAAAATGT ACTCACTTCA GTACTTCAGC 2760 
TCTTCAAGAA AAAAAAAAAA AOCTTAAAAA GCTACTTTTG TGGGAGTATT TCTATTATAT 2820 
AACCAGCACT TACTACCTGT ACTCAAAATT CAGCACCTTG TACATATATC AGATAATTGT 2880 
AGTCAATTGT ACAAACTGAT GGAGTCACCT GCAATCTCAT ATCCTGGTGG AATGCCATGG 2940 
TTATTAAAGT GTGTTTGTGA TAGTTGTCGT CAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 3000 
AAAA 



SEQ ID NO:158 PFA3 Protein sequence: 
Protein Accession #: NP_000847.1 



1 11 21 31 41 51 
| | | | | | 

MFCTKLKDLK ITGECPFSLL APGQVPNESS EEAAGSSESC KATVPICQDI PEKNIQESLP 60 
QRKTSRSRVY LHTLAESICK LIFPEFERLN VALQRTLAKH KIKESRKSLE REDFEKTIAE 120 
QAVQQSPVEL SKNLLVKRFL KYVTRKMKTS LGWLEAPLKI FKQLQYPSET EQPLPRSRKK 180 
GQLEDASILC LDKEDDFLHV YYFFPKRTTS LILPGIIKAA AHVLYETEVE VSLMPPCFHN 240 
DCSEFVNQPY LLYSVHMKST KPSLSPSKPQ SSLVIPTSLF CKTFPFHFMF DKDMTILQFG 300 
NGIRRLMNRR DFQGKPNFEY FEILTPKINQ TFSGIMTMLN MQFVVRVRRW DNSVKKSSRV 360 
MDLKGQMIYI VESSAILFLG SPCVDRLEDF TGRGLYLSDI PIHNALRDVV LIGEQARAQD 420 
GLKKRLGKLK ATLEQAHQAL EEEKKKTVDL LCSIFPCEVA QQLWQGQVVQ AKKFSNVTML 480 
FSDIVGFTAI CSQCSPLQVI TMLNALYTRF DQQCGELDVY KVETIAMPIV WLGGLHKESD 540 
THAVQIALMA LKMMELSDEV MSPHGEPIKM RIGLHSGSVF AGVVGVKMPR YCLFGNNVTL 600 
ANKFESCSVP RKINVSPTTY RLLKDCPGFV FTPRSREELP PNFPSEIPGI CHFLDAYQQG 660 
TNSKPCFQKK DVEDASQFFR QSIRNRLATY IPIYKSLGFD SLKMCRASES TLGIVDG 



SEQ ID NO:159 PFA1 DNA SEQUENCE 

Nucleic Acid Accession #: NM_004362 

Coding sequence: 102-1934 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 
| | | | | | 

CGCCGGCGGG ACTGGTCTGA AGAGACGCGG GGACAAAGTG GCAACGACTT GGACATCTGA 60 
GCTGTCACTG CCGAAAACAG GCCGCAAGAG AGATAATCAA TATGCATTTC CAAGCCTTTT 120 
GGCTATGTTT GGGTCTTCTG TTCATCTCAA TTAATGCAGA ATTTATGGAT GATGATGTTG 180 
AGACGGAAGA CTTTGAAGAA AATTCAGAAG AAATTGATGT TAATGAAAGT GAACTTTCCT 240 
CAGAGATTAA ATATAAGACA CCTCAAGCTA TAGGAGAAGT ATATTTTGCA GAAACTTTTG 300 
ATAGTGGAAG GTTGGCTGGA TGGGTCTTAT CAAAAGCAAA GAAAGATGAC ATGGATGAGG 360 
AAATTTCAAT ATACGATGGA AGATGGGAAA TTGAAGAGTT GAAAGAAAAC CAGGTACCTG 420 
GTGACAGAGG ACTGGTATTA AAATCTAGAG CAAAGCATCA TGCAATATCT GCTGTATTAG 480 
CAAAACCATT CATTTTTGCT GATAAACCCT TGATAGTTCA ATATGAAGTA AATTTTCAAG 540 
ATGGTATTGA TTGTGGAGGT GCATACATTA AACTCCTAGC AGACACTGAT GATTTGATTC 600 
TGGAAAACTT TTATGATAAA ACATCCTATA TCATTATGTT TGGACCAGAT AAATGTGGAG 660 
AAGATTATAA ACTTCATTTT ATCTTCAGAC ATAAACATCC CAAAACTGGA GTTTTCGAAG 720 
AGAAACATGC CAAACCTCCA GATGTAGACC TTAAAAAGTT CTTTACAGAC AGGAAGACTC 780 
ATCTTTATAC CCTTGTGATG AATCCAGATG ACACATTTGA GGTGTTAGTT GATCAAACAG 840 
TTGTAAACAA AGGAAGCCTC CTAGAGGATG TGGTTCCTCC TATCAAACCT CCCAAAGAAA 900 
TTGAAGATCC CAATGATAAA AAACCTGAGG AATGGGATGA AAGAGCAAAA ATTCCTGATC 960 
CTTCTGCCGT CAAACCAGAA GACTGGGATG AAAGTGAACC TGCCCAAATA GAAGATTCAA 1020 
GTGTTGTTAA ACCTGCTGGC TGGCTTGATG ATGAACCAAA ATTTATCCCT GATCCTAATG 1080 
CTGAAAAACC TGATGACTGG AATGAAGACA CGGATGGAGA ATGGGAGGCA CCTCAGATTC 1140 
TTAATCCAGC ATGTCGGATT GGGTGTGGTG AGTGGAAACC TCCCATGATA GATAACCCAA 1200 
AATACAAAGG AGTATGGAGA CCTCCACTGG TCGATAATCC TAACTATCAG GGAATCTGGA 1260 
GTCCTCGAAA AATTCCTAAT CCAGATTATT TCGAAGATGA TCATCCATTT CTTCTGACTT 1320 
CTTTCAGTGC TCTTGGTTTA GAGCTTTGGT CTATGACCTC TGATATCTAC TTTGATAATT 1380 
TTATTATCTG TTCGGAAAAG GAAGTAGCAG ATCACTGGGC TGCAGATGGT TGGAGATGGA 1440 
AAATAATGAT AGCAAATGCT AATAAGCCTG GTGTATTAAA ACAGTTAATG GCAGCTGCTG 1500 
AAGGGCACCC ATGGCTTTGG TTGATTTATC TTGTGACAGC AGGAGTGCCA ATAGCATTAA 1560 
TTACTTCATT TTGTTGGCCA AGAAAAGTAA AGAAAAAACA TAAAGATACA GAGTATAAAA 1620 
AAACCGACAT ATGTATACCA CAAACAAAAG GAGTACTAGA GCAAGAAGAA AAGGAAGAGA 1680 
AAGCAGCCCT GGAAAAACCA ATGGACCTGG AAGAGGAAAA AAAGCAAAAT GATGGTGAAA 1740 
TGCTTGAAAA AGAAGAGGAA AGTGAACCTG AGGAAAAGAG TGAAGAAGAA ATTGAAATCA 1800 
TAGAAGGGCA AGAAGAAAGT AATCAATCAA ATAAGTCTGG GTCAGAGGAT GAGATGAAAG 1860 
AAGCAGATGA GAGCACAGGA TCTGGAGATG GGCCGATAAA GTCAGTACGC AAAAGAAGAG 1920 
TACGAAAGGA CTAAACTAGA TTGAAATATT TTTAATTCCC GAGAGGATGT TTGGCATTGT 1980 
AAAAATCAGC ATGCCAGACC TGAACTTTAA TCAGTCTGCA CATCCTGTTT CTAATATCTA 2040 
GCAACATTAT ATTCTTTCAG ACATTTATTT TAGTCCTTCA TTTCCGAGGA AAAAGAAGCA 2100 
ACTTTGAAGT TACCTCATCT TTGAATTTAG AATAAAAGTG GCACATTACA TATCGGATCT 2160 
AAGAGATTAA TACCATTAGA AGTTACACAG TTTTAGTTGT TTGGAGATAG TTTTGGTTTG 2220 
TACAGAACAA AATAATATGT AGCAGCTTCA TTGCTATTGG AAAAATCAGT TATTGGAATT 2280 
TCCACTTAAA TGGCTATACA ACAATATAAC TGGTAGTTCT ATAATAAAAA TGAGCATATG 2340 
TTCTGTTGTG AAGAGCTAAA TGCAATAAAG TTTCTGTATG GTTGTTTG AT TCTATCAACA 2400 
ATTG AAAGTO TTGTATATGA CCCACATTTA CCTAGTTTGT GTCAAATTAT AGTTACAGTG 2460 
AGTTGTTTGC TTAAATTATA GATTCCTTTA AGGACATGCC TTGTTCATAA AATCACTGGA 2520 
TTATATTGCA GCATATTTTA CATTTGAATA CAAGGATAAT GGGTTTTATC AAAACAAAAT 2580 
GATOTACAGA TTTTTTTTCA AGTTTTTATA GTTGCTTTAT GCCAGAGTGG TTTACCCCAT 2640 
TCACAAAATT TCTTATGCAT ACATTGCTAT TGAAAATAAA ATTTAAATAT TTTTTCATCC 2700 
TGAAAAAAAA 



SEQ ID NO:160 PFA1 Protein sequence: 
Protein Accession #: NP_004353.1 

1 11 21 31 41 51 
| | | | | | 

MHFQAFWLCL GLLFISINAE FMDDDVETED FEENSEEIDV NESELSSEIK YKTPQPIGEV 60 
YFAETFDSGR LAGWVLSKAK KDDMDEEISI YDGRWEIEEL KENQVPGDRG LVLKSRAKHH 120 
AISAVLAKPF IFADKPLIVQ YEVNFQDGID CGGAYIKLLA DTDDLILENF YDKTSYIIMF 180 
GPDKCGEDYK LHFIFRHKHP KTGVFEEKHA KPPDVDLKKF FTDRKTHLYT LVMNPDDTFE 240 
VLVDQTVVNK GSLLEDVVPP IKPPKEIEDP NDKKPEEWDE RAKIPDPSAV KPEDWDESEP 300 
AQIEDSSVVK PAGWLDDEPK FIPDPNAEKP DDWNEDTDGE WEAPQILNPA CRIGCGEWKP 360 
PMIDNPKYKG VWRPPLVDNP NYQGIWSPRK IPNPDYFEDD HPFLLTSFSA LGLELWSMTS 420 
DIYFDNFIIC SEKEVADHWA ADGWRWKIMI ANANKPGVLK QLMAAAEGHP WLWLIYLVTA 480 
GVPIALITSF CWPRKVKKKH KDTEYKKTDI CIPQTKGVLE QEEKEEKAAL EKPMDLEEEK 540 
KQNDGEMLEK EEESEPEEKS EEEIEIIEGQ EESNQSNKSG SEDEMKEADE STGSGDGPIK 600 
SVRKRRVRKD 



SEQ ID NO:161 PEZ9 DNA SEQUENCE 

Nucleic Acid Accession #: NM_005932 

Coding sequence: 75-2216 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 
| | | | | | 

GCGGAGCGCG CGCTCCCAGC GAAAGCAGCA GGGCAGGGAT CTGCGTTGGA GGAAGGGACT 60 
GCTCTGGTGC TAGAATGCTG TGCGTCGGAA GGCTGGGCGG CTTGGGAGCC AGAGCAGCAG 120 
CTCTGCCGCC CCGCCGGGCG GGCCGGGGAA GCCTOGAAGC CGGGATCCGG GCCCGAAGGG 180 
TCAGCACCAG CTGGTCTCCC GTGGGCGCCG CCTTCAATGT CAAGCCCCAG GGCAGCCGCT 240 
TGGACCTGTT CGGCGAGCGG GCGCGTCTTT TTGGAGTTCC TGAGCTGAGT GCCCCAGAAG 300 
GATTTCATAT TGCACAAGAA AAAGCCTTGA GAAAGACAGA ATTGCTTGTG GACCGTGCAT 360 
GTTCCACCCC ACCTGGGCCC CAGACCGTGC TGATCTTCGA TGAGCTCTCG GATTCCTTAT 420 
GCAGAGTGGC CGACTTGGCT GATTTTGTGA AAATCGCTCA CCCTGAGCCA GCATTCAGAG 480 
AAGCTGCGGA AGAAGCTTGT AGAAGTATTG GCACCATGGT AGAGAAGTTG AACACAAATG 540 
TGGATTTATA TCAAAGTTTG CAAAAATTAC TAGCTGATAA AAAACTTGTG GATTCCCTTG 600 
ATCCAGAAAC AAGGCGAGTG GCTGAACTGT TTATGTTTGA TTTTGAAATT AGTGGAATCC 660 
ATCTAGACAA ACAAAAGCGT AAAAG AGCAG TQGACCTCAA TGTTAAAATC TTGGATTTGA 720 
GTAGTACATT TCTTATGGGA ACCAATTTTC CCAACAAGAT TGAGAAGCAT CTCTTACCAG 780 
AACACATTCG TCGTAACTTT ACATCTGCTG GGGATCATAT CATAATTGAT GGTCTCCACG 840 
CAGAATCACC AGATGACTTG GTGCGAGAAG CTGCTTATAA A ATTTTTCIT TATCCCAATG 900 
CTGGTCAATT GAAATGTTTA GAAGAATTGC TCAGCAGCAG AGATCTTCTG GCAAAGTTGG 960 
TGGGGTATTC CACGTTTTCT CACAGGGCTC TCCAAGGAAC GATAGCTAAA AATCCAGAGA 1020 
CTOTCATGCA GTTDCTTGAA AAACTATCTG ACAAACTTTC TGAAAGAACT CTGAAAGATT 1080 
TTG AG ATG AT ACGAGGGATG AAAATG AAAC TGAATGCTCA AAATTCCG AA GTAATGCCCT 1140 
GGGACCCCCC TTACTA CAGT GGTGTGATTC GTGCAGAAAG GTATAATATT GAGOCCAGOC 1200 
TATATTGCCC GTTTTTCTCT CTTGGAGCAT GCATGGAAGG CCTGAATATT TTGCTTAACA 1260 
GACTGTTGGG GATTTCATTA TATGCAGAGC AGCCTGCAAA AGGAGAGGTG TGGAGCGAAG 1320 
ATGTCCG AAA ACTGGCTGTT GTTCATGAAT CTGAAGGATT GTTGGGGTAC ATTTACTGTG 1380 
ATTTTTTTCA GCGAGCAGAC AAACCACATC AGGATTGCCA TTTCACTATC CGTGGAGGCA 1440 
GACTAAAGGA AGATGGAG AC TATCAACTOC CACTTGTAGT TCTTATGCTG AATCTTCCCC 1500 
GTTCCTCAAG GAGTTCTCCA ACTTTGCTAA CTCCTGGCAT GATGGAAAAT CTTTTCCATO 1560 
AAATGGGACA TGCCATGCAT TCAATGCTAG GACGTACTCG TTACCAACAC GTCACTGGGA 1620 
OCAGGTGCCC TACTGATTTT GCTGAGGTTC CTTCTATTCT GATGGAGTAC TTTGCAAATG 1680 
ATTATCGAGT AGTTAACCAA TTTGCCAG AC ATTATCAGAC TGGACAGCCA CTGCCAAAAA 1740 
ATATGGTGTC TCGTCTTTGT G AATCTAAAA AGGlTltflGC TGCAGCTGAT ATGCAACTTC 1800 
AGGTCTTTTA TGCCACTCTG GATCAAATCT ACCATGGGAA GCATCCCCTG AGGAATTCAA 1860 
CCACAGACAT TCTCAAGG AA ACACAAG AG A AATTCTATGG CCTACCATAT GTTCCAAATA 1920 
CTGCCTGGCA GCTGCGATTC AGCCACCTCG TGGGGTATGG TGCTAGATAT TACTCTTACC 1980 
TCATGTCCAG AGCGGTCGCC TCCATGGTTT GG AAGGAGTG TTTTCTACAG GATCCTTTCA 2040 
ACAGGGCTGC CGGGGAGCGC TATCGCAGGG AGATGCTGGC CCACGGTGGA GGCAGGGAGC 2100 
CCATGCTCAT GGTTGAAGGT ATGCTTCAGA AGTGTCCTTC TGTTGATG AC TTCGTAAGTG 2160 
CCCTCGTTTC CGACTTGGAT CTGGACTTCG AAACTTTCCT CATGGATTCT GAATAAAAGA 2220 
AACACTCTAC ACCTCTAATC AAGGTCATGT AGTAATG ACT TTGTTATAAA TGCTACAGCT 2280 
GTGAGAGCTT GTTTCTGATT GTTTCATTGT TCGCTTCTGT AATTCTGAAA AACTTTAAAC 2340 
TGGTAGAACT TGGAATAAAT AATTTGTTTT AATTAAAAAA AAAAAAAAAA AA 



SEQ ID NO:162 PEZ9 Protein sequence: 
Protein Accession #: NP_005923.1 

1 11 21 31 41 51 
| | | | | | 

MLCVGRLGGL GARAAALPPR RAGRGSLEAG IRARRVSTSW SPVGAAFNVK PQGSRLDLFG 60 
ERARLFGVPE LSAPEGFHIA QEKALRKTEL LVDRACSTPP GPQTVLIFDE LSDSLCRVAD 120 
LADFVKIAHP EPAFREAAEE ACRSIGTMVE KLNTNVDLYQ SLQKLLADKK LVDSLDPETR 180 
RVAELFMFDF EISGIHLDKQ KRKRAVDLNV KILDLSSTFL MGTNFPNKIE KHLLPEHIRR 240 
NFTSAGDHII IDGLHAESPD DLVREAAYKI FLYPNAGQLK CLEELLSSRD LLAKLVGYST 300 
FSHRALQGTI AKNPETVMQF LEKLSDKLSE RTLKDFEMIR GMKMKLNAQN SEVMPWDPPY 360 
YSGVIRAERY NIEPSLYCPF FSLGACMEGL NILLNRLLGI SLYAEQPAKG EVWSEDVRKL 420 
AVVHESEGLL GYIYCDFFQR ADKPHQDCHF TIRGGRLKED GDYQLPLVVL MLNLPRSSRS 480 
SPTLLTPGMM ENLFHEMGHA MHSMLGRTRY QHVTGTRCPT DFAEVPSILM EYFANDYRVV 540 
NQFARHYQTG QPLPKNMVSR LCESKKVCAA ADMQLQVFYA TLDQIYHGKH PLRNSTTDIL 600 
KETQEKFYGL PYVPNTAWQL RFSHLVGYGA RYYSYLMSRA VASMVWKECF LQDPFNRAAG 660 
ERYRREMLAH GGGREPMLMV EGMLQKCPSV DDFVSALVSD LDLDFETFLM DSE 



SEQ ID NO:163 PEZ8 DNA SEQUENCE 

Nucleic Acid Accession #: AF103S07 

Coding sequence: none (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 
| | | | | | 

ACAGAAGAAA TAGCAAGTGC CGAGAAGCTG GCATCAGAAA AACAGAGGGG AGATTTGTGT 60 
GGCTGCAGCC GAGGGAGACC AGGAAGATCT GCATGGTGGG AAGGACCTGA TGATACAGAG 120 
GAATTACAAC ACATATACTT AGTGTTTCAA TGAACACCAA GATAAATAAG TGAAGAGCTA 180 
GTCCGCTGTG AGTCTCCTCA GTGACACAGG GCTGGATCAC CATCGACGGC ACTTTCTGAG 240 
TACTCAGTGC AGCAAAGAAA GACTACAGAC ATCTCAATGG CAGGGGTGAG AAATAAGAAA 300 
GGCTGCTGAC TTTAOCATCT GAGGCCACAC ATCTGCTGAA ATGGAGATAA TTAACATCAC 360 
TAG AAACAGC AAG ATG ACAA TATAATGTCT AAGT AGTG AC ATGTmTGC AC ATTTCCAG 420 
CCCCTTTAAA TATCCACACA CACAGGAAGC ACAAAAGGAA GCACAGAGAT OCCTGGGAGA 480 
AATGCCCGGC CGCCATCTTG GGTCATCGAT GAGCCTCGCC CTGTGCCTGG TCCCGCTTGT 540 
GAGGGAAGGA CATTAGAAAA TGAATTGATG TGTTCCTTAA AGGATGGGCA GGAAAACAGA 600 
TCCTGTTGTG GATATTTATT TG AACGGGAT TACAGATTTG AAATGAAGTC ACAAAGTGAG 660 
CATTAOCAAT GAGAGGAAAA CAGACG AGAA AATCTTGATG GCTTCACAAG ACATGCAACA 720 
AACAAAATGG AATACTGTGA TGACATGAGG CAGCCAAGCT GGGGAGGAGA TAACCACGGG 780 
GCAG AGGGTC AGG ATTCTGG CCCTGCTGCC TAAACTGTGC GTTCATAACC AAATCATTTC 840 
ATATTTCTAA CCCTCAAAAC AAAGCTGTTG TAATATCTG A TCTCTACGGT TCCTTCTGGG 900 
CCCAACATTC TCCATATATC CAGCCACACT CATTTTTAAT ATTTAGTTCC CAG ATCTGTA 960 
CTGTGACC1 1 TCI ACACTGT AOAATAACAT TACTCATTTT GTTCAAAGAC CCTTCGTGTT 1020 
GCTGCCTAAT ATGTAGCTGA CTGiMTTCC TAAGGAGTGT TCTGGCCCAG GGGATCTGTG 1080 
AACAGGCTGG GAAGCATCTC AAGATCTTTC CAGGGTTATA CTTACTAGCA CACAGCATGA 1140 
TCATTACGGA GTGAATTATC TAATCAACAT CATCCTCAGT GTCTTTGCCC ATACTGAAAT 1200 
TCATTTCCCA CTTTTGTGCC CATTCTCAAG ACCTCAAAAT GTCATTCCAT TAATATCACA 1260 
GGATTAA C1 1 ' 11 I ' ll 11 1A A CCTGGAAGAA TTCAATGTTA CATGCAGCTA TOGGAATTTA 1320 
ATTACATATT TTGTTTTCCA GTGCAAAGAT GACTAAGTCC TTTATCCCTC COCTTTGTTT 1380 
GATTTTTTTT CCAGTATAAA GTTAAAATGC TTAGCCTTGT ACTGAGGCTG TATACAGCAC 1440 
AGCCTCTCCC CATCCCTCCA GCCTTATCTG TCATCACCAT CAACCCCTCC CATACCACCT 1500 
AAACAAAATC TAACTTGTAA TTOCTTGAAC ATGTCAGG AC ATACATTATT CCTTCTGCCT 1560 
GAGAAGCTCT TCCTTGTCTC TTAAATCTAG AATGATGTAA AGTTTTGAAT AAGTTGACTA 1620 
TCTTACTTCA TGCAAAG AAG GO ACACATAT G AG ATTCATC ATCACATGAG ACAGCAAATA 1680 
CTAAAAGTGT AATTTGATTA TAAGAGTTTA GATAAATATA TGAAATGCAA GAGCCACAG A 1740 
GGGAATGTTT ATGGGGCACG TTTGTAAGCC TGGGATGTGA AGCAAAGGCA GGGAACCTCA 1800 
TAGTATCTTA TATAATATAC TTCATTTCTC TATCTCTATC ACAATATCCA ACAAGCTTTT 1860 
CACAGAATTC ATGCAGTGCA AATCCCCAAA GGTAACCTTT ATCCATTTCA TGGTGAGTGC 1920 
GCTTTAGAAT TTTGGCAAAT CATACTGGTC ACTTATCTCA ACTTTGAGAT GTGTTTGTCC 1980 
TTGTAGTTAA TTGAAAGAAA TAGGGCACTC TTGTGAGCCA CTTTAGGGTT CACTCCTGGC 2040 
AATAAAGAAT TTACAAAGAG CTACTCAGG A CCAGTTGTTA AGAGCTCTGT GTGTGTGTGT 2100 
GTGTGTGTGT GAGTGTACAT GOCAAAGTGT GCCTCTCTCT CTTGACCCAT TATTTCAGAC 2160 
TTAAAACAAG CATGTTTTCA AATGGCACTA TGAGCTGCCA ATGATGTATC ACCACCATAT 2220 
CTCATTATTC TCCAGTAAAT GTGATAATAA TGTCATCTGT TAACATAAAA AAAGTTTGAC 2280 
TTCACAAAAG CAGCTGGAAA TGGACAACCA CAATATGCAT AAATCTAACT CCTACCATCA 2340 
GCTACACACT GCTTGACATA TATTGTTAGA AGCAOCTOGC ATTTGTGGGT TCTCTTAACC 2400 
AAAATACTTG CATTAGGTCT CAGCTGGGGC TGTGCATCAG GCGGTTTGAG AAATATTCAA 2460 
TTCTCAGCAG AAGCCAGAAT TTGAATTCCC TCATCTTTTA GGAATCATTT ACCAGGTTTG 2520 
GAGAGGATTC AG ACAGCTCA GGTGCTTTCA CTAATGTCTC TGAACTTCTG TCOCTCTTTG 2580 
TGTTCATGG A TAGTCCAATA AATAATGTTA TCTTTGAACT GATGCTCATA GGAGAG AATA 2640 
TAAG AACTCT GAGTGATATC AACATTAGGG ATTCAAAG AA ATATTAGATT TAAGCTCACA 2700 
CTGGTCAAAA GGAACCAAGA TACAAAGAAC TCTGAGCTGT CATCGTCCCC ATCTCTGTGA 2760 
GCCACAACCA ACAGCAGG AC CCAACGCATG TCTGAG ATCC TTAAATCAAG GAA ACCAGTG 2820 
TCATGAGTTG AATTCTCCTA TTATGGATGC TAGCTTCTGG COVTCTCTCG CTCTCCTCTT 2880 
GACACATATT AGCTTCTAGC CTTTGCTTCC ACG ACTTTTA TCTTTTCTCC AACACATCGC 2940 
TTACCAATCC TCTCTCTGCT CTGTTGCTTT GGACTTCCCC ACAAGAATTT CAACOACTCT 3000 
CAAGTCTTTT CTTCCATCCC CACCACTAAC CTGAATGCCT AGACCCTTAT TTTTATTAAT 3060 
TTCCAATAG A TGCTGCCTAT GGGCTATATT GCTTTAGATG AACATTAG AT ATTTAAAGCT 3120 
CAAG AGGTTC AAAATCCAAC TCATTATCTT CTCTTTCTTT CACCTCCCTG CTOCTCTCCC 3180 
TATATTACTG ATTGCACTGA ACAGCATGGT CCCCAATGTA GCCATGCAAA TGAGAAACCC 3240 
AGTGGCTCCT TGTGGTACAT GCATGCAAGA CTGCTGAAGC CAGAAGGATG ACTGATTACG 3300 
CCTCATGGGT GGAGGGGACC ACTCCTGGGC CTTCGTGATT GTCAGGAGCA AGACCTGAGA 3360 
TGCTCCCTGC CTTCAGTGTC CTCTGCATCT CCCCTTTCTA ATG AAG ATCC ATAGAATTTG 3420 
CTACATTTGA GAATTCCAAT TAGGAACTCA CATGTTTTAT CTGCCCTATC AATTTTTTAA 3480 
ACTTGCTGAA AATTAAGTTT TTTCAAAATC TGTCCTTGTA AATTACTTTT TCTTACAGTG 3540 
TCTTGGCATA CTATATCAAC TTTG ATTCTT TGTTACAACT TTTCTTACTC TTTTATCACC 3600 
AAAGTCGCTT TTATTCTCTT TATTATTATT ATTTTCTTTT ACTACTATAT TACOTTGTTA 3660 
TTATTTTGTT CTCTATAGTA TCAATTTATT TGATTTAGTT TCAATTTATT TTTATTGCTG 3720 
ACTTTTAAAA TAAGTGATTC GGGGGGTGGG AGAACAGGGG AGGGAGAGCA TTAGGACAAA 3780 
TACCTAATGC ATGTGGGACT TAAAACCTAG ATGATGGGTT GATAGGTGCA GCAAACCACT 3840 
ATGGCACACG TATACCTGTG TAACAAACCT ACACATTCTG CACATGTATC CCAGAACGTA 3900 
AAGTAAAATT TAAAAAAAAG TGA 



PEZ8 Protein sequence: 

Protein Accession #: 






SEQ ID NO:164 PEZ6 DNA SEQUENCE 

Nucleic Acid Accession #: AB028945 

Coding sequence: 1-3765 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

ATGATGATGA ACGTCCCCGG CGGAGGAGCG GCCGCGGTGA TGATGACGGG CTACAATAAT 60 
GGTCGCTGTC CCCGGAATTC TCTCTACAGT GACTGCATTA TTGAGGAGAA GACGGTGGTC 120 

CTGCAGAAAA AAGACAATGA GGGCTTTGGA TTCGTGCTTC GAGGGGCCAA AGCTGACACA 180 
CCCATTG AAG AATTCACACC AACACCGGCT TTCCCAGCCC TACAGTACCT GGAGTCCGTG 240 
GATGAAGGTG GGGTGGCGTG GCAAGCCGGA CTAAGGACOG GGGAC11C11 GATTGAGGTT 300 
AACAATGAGA ATGTTGTCAA AGTCGGCCAC AGGCAGGTGG TGAACATGAT COGGCAGGGA 360 
GGGAATCACC TGGTCCTTAA GGTGGTCACG GTGACCAGGA ATCTGGACCC CGACGACACC 420 

GCCAGGAAGA AAGCTCCCCC GCCTCCAAAG CGGGCACCGA CCACAGCCCT CACCCTGCGC 480 
TCCAAGTCCA TGACCTCGGA GCTGGAGGAG CTCGTGGATA AAG ATAAACC CGAGGAGATA 540 
GTCCCGGCCT CCAAGCCCTC CCGCGCTGCT GAGAACATGG CTGTGGAACC GAGGGTGGCG 600 
ACCATCAAGC AGCGGCCCAG CAGCCGGTGC TTCCCGGCGG GCTCAGACAT GAACTCTGTG 660 
TACGAACGCC AAGGAATCGC CGTGATGACG CTCACTGTTC CTGGGAGCCC AAAAGCCCCG 720 

TTTCTGGGCA TCCCTCGAGG TACGATGCGA AGGCAGAAAT CAATAGACAG CAGAATCTTT 780 
CTATCAGGAA TAACAGAGGA AGAGOGGCAG TTTCTGGCTC CTCCAATGCT GAAGTTCACC 840 
AG AAGCCTGT CCATGCCGGA CACCTCTGAG GACATCCCCC CTCCACCGCA GTCTGTGCCC 900 
CCGTCCCCAC CACCACCTTC CCCAACCACT TACAACTGCC CCAAGTCCCC AACTCCAAGA 960 
GTCTACGGGA CGATTAAGCC TGCGTTCAAT CAGAATTCTG CCGCCAAGGT GTOCCCCGCC 1020 

ACCAGGTCCG ACACCGTGGC CACCATGATG AGGGAGAAGG GGATGTACTT CAGGAGAGAG 1080 
CTGGACCGCT ACTCCTTGGA CTCTGAAGAC CTCTACAGTC GGAATGCCGG CCCGCAAGCC 1140 
AACTTCCGCA ACAAGAGAGG CCAGATGCCA GAAAACCCAT ACTCAGAGGT GGGGAAGATC 1200 
GCCAGCAAAG CCGTCTACGT CCCCGCCAAG CCCGCCAGGC GGAAGGGGAT GCTGGTGAAG 1260 
CAGTCCAACG TGGAGGACAG CCCCGAGAAG ACGTGCTCCA TCCCTATCCC GACCATCATC 1320 

GTGAAGGAGC CGTCCACCAG CAGCAGCGGC AAGAGCAGCC AGGGCAGCAG CATGGAGATC 1380 
GACCOCCAGG CCCCGGAGCC ACOGAGCCAG CTGCGGCCTG ACGAAAGCCT GACCGTCAGC 1440 
AGCCCCTTTG CCGCCGCCAT CGCCGGAGCC GTCCGCGACC GTGAGAAGCG GCTGGAAGCC 1500 
AGGAGGAACT COCOGGOCTT OCTCTCCACA GAOCTGGGGG ATGAGGATGT GGGCCTGGGG 1560 
CCACCCGCCC CCAGGACGCG GCCCTOCATG TTCCCCGAGG AGGGGGATTT TGCTGACGAG 1620 

GACAGCGCTG AGCAGCTGTC ATCCCCCATG CCGAGTGCCA CGCCCAGGGA GCCCGAAAAC 1680 
CATTTCGTGG GTGGCGCCGA GGCCAGTGCT COGGGTGAGG CTGGGAGGCC GCTGAATTCC 1740 
ACGTCCAAAG GCCAGGGGCC CGAGAGCAGC CCAGCAG1GC CCTOOGGGAG CAGCGGCACA 1800 
GCCGGCGCOG GGAATTATGT CCAOCCACTC ACAGGGCGGC TGCTTGATCC CAGCTCCCCG 1860 
CTGGCCCTGG CACTCTDCGC AAGGGACCGA GCCATGAAGG AGTCTCAACA GGGACCCAAA 1920 

GGGGAGGCCC CCAAGGCCGA GCTCAACAAA CCTCTTTACA TTGATACCAA AATGCGGCCC 1980 
AGCCTGGATG CCGGCTTCCC TACGGTCACC AGGCAGAACA OCOGGGGACC CCTGAGGCGG 2040 
CAGGAGACGG AGAACAAGTA OGAGAOCGAC CTGGGCGGAG ACCGGAAAGG CGATG ACAAG 2100 
AAGAACATGC TGATCGACAT CATCGACACG TCCCAGCAGA AGTCGGCTGG CCTGCTGATC 2160 
GTGCACACCG TGGACGCCAC TAAGCTGGAC AACGCCCTGC AGGAAGAGGA OGAGAAGGCA 2220 

GAGGTGGAGA TGAAGCCAGA CAGCTCGCCG TCCGAGGTGC CAGAAGGTGT TTCCGAAACC 2280 
GAAGGTGCTT TACAGATCTC CGCTGCCCCC GAGCCCACCA CCGTGCCCGG CAGAACCATC 2340 
GTCGCGGTGG GCTCCATGGA AGAGGCGGTG ATTTTGCCAT TCCGCATCCC TCCTCCCCCT 2400 
CTGGCATCCG TGGACTTGGA TGAGGATTTT ATTTTTACAG AGCCATTCCC TCCTCCCCTG 2460 
GAATTTGCAA ATAGTTTTG A TATCCCCGAT GACCGGGCAG CTTCTGTCCC GGCTCTCTCA 2520 

GACTTAGTGA AGCAGAAGAA AAGCGACACC CCTCAGTCCC CTTCGTTGAA CTCCAGCCAA 2580 
CCAACCAACT CTGCAGACAG CAAGAAGCCA GCCAGTCTTT CAAACTGTCT GCCTGCCTCA 2640 
TTCCTGCCAC CCCCTGAAAG CTTTGACGCC GTCGCCGACT CTGGGATCGA GGAGGTGGAC 2700 
AGCCGGAGTA GCAGCG ACCA CCACCTCGAG ACGACCAGCA CTATCTCCAC CGTCTCTAGC 2760 
ATCTCCACCC TGTCTTCCG A AGGTGGAGAG AATGTGGACA CCTGCACAGT CTATGCAGAT 2820 

GGGCAAGCAT TTATGGTTGA CAAACCCCCA GTACCTCCTA AGCCAAAAAT GAAGCCCATC 2880 
ATTCACAAAA GCAATGCACT TTATCAAG AC GCGCTCGTGG AAG AAGATGT AGATAGCTTT 2940 
GTTATCCCCC CGCCCGCTCC CCCGCCCCCG GCGGGCAGTG CCCAGCCTGG GATGGCCAAG 3000 
GTTCTCCAGC CAAGGAOCTC CAAGTTGTGG GGOGACGTCA CAGAGATCAA AAGCCCGATT 3060 
CTCTCAGGCC CAAAGGCAAA CGTTATTAGT GAATTGAACT CTATCCTACA GCAAATGAAC 3120 

CGAGAGAAAT TGGCAAAGCC GGGGGAAGGA CTGGATTCAC CAATGGGAGC CAAGTCCGCC 3180 
AGCCTCGCTC CAAGAAGCCC GGAGATCATG AGCACCATCT CAGGTACACG GAGCACGACG 3240 
GTCACCTTCA CTGTTCGCOC CGGCACCTCC CAGCCCATCA CCCTGCAGAG CCGGCCCCCC 3300 
GACTATGAAA GCAGGACCTC AGGAACAAGA CGTGCCCCAA GCCCTGTGGT CTCGCCAACA 3360 
GAGATGAACA AAGAGACCCT GCCCGCCCCC CTOTCTGCTG CCACCGCCTC TCCTTCTCCC 3420 

GCTCTCTCAG ATGTCTTTAG CCTTCCAAGC CAGCCCCCTT CTGGGGATCT ATTTGGCTTG 3480 
AACCCAGCGO GACGCAGTAG GTCGCCATCC CCCTCGATAC TGCAACAGCC AATCTCAAAT 3540 
AAGCCTTTTA CAACTAAACC TGTCCACCTG TGGACTAAAC CAGATGTGGC OGATTGGCTG 3600 
GAAAGTCTAA ACTTJGGGTGA ACATAAAGAG GCCTTCATGG ACAATGAG AT CGATGGCAGT 3660 
CACTTACCAA ACCTGCAGAA GGAGGACCTC ATCGATCTTG GGGTAACTCG AGTCGGGCAC 3720 
AGAATGAACA TAGAAAGGGC TTTGAAACAG CTGCTGGACA GATAAGGACG GCTGCTCTCC 3780 
ACCTOGCAGA CTCCTCTTGT TATAAGTAGA G ATGGGCTCG TGCTGAAACA TCTG AATGOC 3840 
AAGCG AAGTC TGTG AGC ATC AACCCCACTC CATGGGTTTG TCTCCTGGTA CCCA AAGAAA 3900 
TACTGAGTTG TGTCCACAAC ATGGCTGGGT CTTCAGACCC CTGGCTCACC ATGTGGGTGT 3960 
CTTGGGCAGT TTCTATCACA CATGGGACAA GGGGAGGGAG TTTTTCTAAC ATGGAAAAAG 4020 
ATTCCCAGCC TGCCGCCCAG CATGCAGGTG GCCTCX3CTTT GCCGGOTCCG AGAGGCTOCC 4080 
CGTCAATTTT GCACGGGATC CTAGCTCTTG TAGGCAGACA CCAGTGCACT CTAGATACCT 4140 
CCTGAGACCT CCGTCCTCTG CTTTCCGGGC AGCTCTCACC ACCCCAGGCC CCGGCATGAG 4200 
GCCTTTCCTC AGTCCTGTGG CCTCTCAGAG GACACCTGAT GCTCACCTGC CCCTCTTTCT 4260 
CCTGCACTTG GCTTGCAGTG AOATGCTCCC AGATGCATTT GTCCAGTGCC CCATCATGGG 4320 
CCTOAAAGGC AGAOAAA C1 1 TTTCCTACAC AGA TTCl 1 1 1 CCCCATCTCC TCCTGTGGTT 4380 
TGCATCCATG GCTCTTTGGC CATGAGGTTC CTGGCAGTGC TGGGAGTTTG GATGGGATOG 4440 
TCKXCAGCTT TGCTTAGCTT TCTTTATTTC TGCAAATCTG TTAGCATAAT TCCAAGGTGG 4500 
CCAAGCAGAT GTCACATGGA GTTAGTCAAA GCACAAAGTC ACGATTCCAC AATGGAGGGG 4560 
AGACCTGGCC AAGGGAGCCA GCCAGCGTGC AACTGCCCAA GCTCCAGGTC TCCAGGACAA 4620 
GAGCAGTTGT CTGCCATG AG CACCCATCCA GGATGGAGAA TAAGGGCTTC TCTGCCTCTC 4680 
AG AATTCTTT TTAATTG AAG ATGTCTTGAG CTCTGCAAAG ATCAGAGCAG GTGAGCATCC 4740 
ACTTTGACAT GAAGGACAAG AAGACGCATO GCTCATGGCG GGCACATGCG GCTGCCAGTG 4800 
AGACAGCGTC TCCTCTGGGA GCTGGGCGGG CACAGCATCC TCAGTTCTGT GCCCAGCCAA 4860 
GGGTG AGCAT CTCTGCTGAG ACAGTCCTTT TGCTCTCGGA GGCCAGGGAA G ATGGTACIT 4920 
AGAGGCTTTT OCOCTATCGC TCTGGGTGTC TAGG AATOCC ACCAGCTTGT CTTAACAGTA 4980 
CAACAGCTTC TTTGAGGAGC CAGTGGGTAT GGAGTATAGA CAGAACCCAG GGTTGAGAAC 5040 
AGAAGGTGGG CGGCAGGATC AGAGTGAAAG CAGAGGCGTG AGGAGAGGAA AGCAGGGAGG 5100 
TCTCCTGGGC TGOCAGGTCA GCCTCTCTGG CAAGGCTTTC TTGAGCCOCG OOOCTTTCIT 5160 
TOCOCGG AGT CCCTCCACCC CATAACAATA CCTCGAATTT CCAAAAGAGG TCACCAOATG 5220 
CACATCGGCC GCAAAACACA CAGTCAGGCT TCCAGCACAT TCTCCCCCAT TTGGAGGATA 5280 
CTCGAATGTC A G G ' 1 ' 1 1 1 1G G TTTTATTATT ATTTCAGAA C TAGCTCAGCC CATCTCTAAT 5340 
TATAAAACAT GGTTTTGTTT TTTTTTTTTC CTTTTTTTCT TGATTAGGTC TGGAACAGCT 5400 
CTAGAATGAA CACATAAAAT TTAGCAATTT AAAATCTTTC TTTACTGCAA GTTTAAATAG 5460 
TTOTACAGAT AGTTTATAAG CACAATATTT TAAGAAAAAA AAGTGGCTGG TCTACTAGGC 5520 
AGCCTTTGTG CCACTTCAGT GCTAGAAAGT TAAAGAAAAA AAAACTTTTG TGATTTAATA 5580 
ATACTATTTC TGTGGAATAA TTATAAAAGT ATGACCTTTT TAAATCAACC TTATTTGGAT 5640 
GCATCTGAAC CAGCAGAGCT GTGTTATATT TTCTATCTTT GCTAGAACTT CGTCATTGAA 5700 
GGACAATTTC TTCAAAGTGG TTAC AATTCA TAATGCAGCA GTTTCTCCAA AAACAAAAAC 5760 
AAAACACACA CCACACACAC GCGC1 II1CC AGTCACACAC CCCTGATGTT GOAACCAAGT 5820 
TTTTGGACCT TCTGTTCCAA AACCTTTTGC AGGTCAATCT TTGTATTTGA AATGATCCAA 5880 
TCCAACTTGA AGTCAATTGA ATATTAAGGC GCTTTACTTC CGTGTGCTTT CAGTTTTTCC 5940 
ATCATGAGAT GAATGAGCAT TACTCTAGAT AAATTTCAAG ACAGGATACT ACAGGTGGCC 6000 
TGCTG AGGCT GCCCCATATT TTAGAAAATG TAAAAATGGT GGTTTGGCCA TTAATTTGTC 6060 
TTCCATTTGA TGATACCGCA AAATTCCGTG AGTCCATTCC TTTGGCATGG CACTTTCCCT 6120 
GGGCCTACAG TTGGTATTAC CTCTGTGCTC AGTGCCAGGC AAAACACTAG CTCAAAGGAG 6180 
AGTCAAGGAA ACCGCTGGCA GACGATAAOC AGTCGAAACT CGTGACTTGG GTTTGTTGAA 6240 
CTTTGGCAGC CAGTTGGTGA GGGCCAGATG TTATTCCCTT TCTTAAAGAT ACTCCAAGGC 6300 
ACATGCCACT AACCACAAGC AAGCTGGCTG CAAGACTAAA GAGCTGATAA CATAGTTTAT 6360 
TTTTACACTG TCTTATTATA GAG AAGTAAT AGACCTATCA GAACCTGCAC TGACCAACAA 6420 
ATAAACACAT GTTGCCAAGA TGAATCGGTC TCTATCTCTA TCTGCTTATT TTGGTACTGA 6480 
AAGCAATAGT TCCTCATTCA AATCACCACC CACTGTTCTC OCCCTTTGGG ACATGTTAGG 6540 
AOGAGGCCCT ATTCCATGCC CC1 VI 1 1AAT GGTGGAACAA ATGTTAAACT GCICATCTAA 6600 
AGATCATGTT GATATTATTC CAGGTTTTAA GATCAACTTT TGTTACATAC TGTAATTTAA 6660 
ATAAACTGCA TTTACATGCC TAGTTTCTGT AATATTGTGT ATACAAAACC CAAATCTCTC 6720 
AAAATGTAAA TTATGTATAC CTGCCAAG AT ACCTTTTCCA GGGTGTCTGC GCACATTTTA 6780 
AGTTAATTCA CATAATATAA AAATTACTCA ATGTGACTGT TGATTTGCTG AACTTTACAT 6840 
ATCACAAAGT GAATTATTTG TG ATACTTTA GTTAATAAAA TGGTAAATTT TTTTCTCAGT 6900 
TATTGAACAA GCAAGCATTA TCCAGTTG AT CTGGCAATGA CI 1 11 10 TGTCTGGCCCACA 6960 
ATATTGATTT TOCCATTAAC AA TITI ITI 1 T G 11 1 T1 T AA A TACTAA TAT GTTTCACACT 7020 
ATAGTTTGTG T AACAACACG TGTTCGCATT ATCTATGTTG CTGTTACTTT TGTGCTTTT A 7080 
TTCTTTTTA G ACTTTATAAA AAAAAAAAAA AGCTCCTGTA ATTTGCACTT TCTCCCAATC 7140 
CTTAAATCTC TTGTATCGCA ACCAAAATTA CTGTAAAAAA ATAAATATAC TATTGCACTA 7200 
AGGTTGTGGT TCTG ATTGCA AACAAACAGT GAACACTGTC TGAATTAAAC AAAAAGCTGC 7260 
CCGACTTGCA ATCTAATGTA GATTATCTCA GGCATTGTGG CCAGCTCTGC CTCTCTAAAA 7320 
CTGACCAGAA AAATCTCTCT CATCGAGTAA ACAGGCTOCT GTCACTGAGC TAATCTGCCT 7380 
TGGTTCCATT TCCTTATTCT CAATTTATCA ATGGATACGT GCATGTTATT TCAGAATTAT 7440 
GCAAAACGTC AAAATCTGCT TCTGTGACCG CTGCTATAGG CGTGGAGCTG AGGCTCGGCT 7500 
TTTOC T TTT G TTCTGGGTGG AAGCAGCGGT GCCGCGGAGG GCCAGCCAGA TCCGGACCCT 7560 
TCCCTTAGGG TCCAGTCTOC CCACACCCCA GCAGGGTGTC TTCTAGCCAT AAGGCCAAGG 7620 
GAGTGGCAGA ACTGGGCCGC CICICIUGIT GACAAGCAAA CCACATGCTA AGGCTTGGAG 7680 
CAAGAGAG AA TTTGTGTCTA TTGGCAAAGA ACTAAGCCAG GAAGACATGG GCCATCCCTC 7740 
OGCTTTAGGG AAGCATATTT TAAACCTAAA CGTTGAACTT CTTCTTTGGC CTCACCAGTG 7800 
AAAACTTGTT GTCTTTAGTT CCTAAAGTTT CTTCTACTTT GGCACATTCC CCAGTTG AGC 7860 
AGCAGCCTCT ATGCTTCCAC GTTCAGG AAA AATTCCAGTC CTCATATCTT TTGTAGTTCA 7920 
COCTCAAGCT CTCCCGCTTC ACCATCC AAT AGTTTCTCCC AAACCTTGGC ACCCCCCTAG 7980 
ACTTTGCTTC CAATGGTTTC TTCCAGACCA CTTTTCCTAG ATGAATATAT TCGTTTACCT 8040 
TACTAGGAAA ATTATTGGAA GATTTTTTCT TTTACTTGAA ATTGGAGGCA TTTTAATAAC 8100 
TGGCG AACTG GAATGTGTTT CTGTATTTGT AGACAACCAT GTACCCATGC AAGTAGGTGA 8160 
ACATTCCACA GTGGCTGGGT GACCACAGCA GCTGCATGCA GACAGGACTG CCCGTGCTTT 8220 
GTGGGGAATC AGAG AATTTC CAAACTTGTT TCTCAGACTT CCGCAG ATCT CATCACTTTG 8280 
ATTTCTAATC CATGCTGTAT TGGTGATTTT GTTTATCGTT CCTGTAACTT GTTCTACATT 8340 
CCACAGTCTT T ACCGTTTTA TGTTCAAAAT TACAACAATC CCTGTCCATT GATTCCACTC 8400 
TGGAACTCTT TGTTCATGCC AATTTTG AAA TTTTAATACG AGCCTTCAAA TAAACACAGA 8460 
AAAGAAAAAA AAAAAAAAAA AAAAAAAA 



SEQ ID NO:165 PEZ6 Protein sequence: 
Protein Accession #: BAA82974.1 



1 11 21 31 41 51 

MMMNVPGGGA AAVMMTGYNN GRCPRNSLYS DCIIEEKTVV LQKKDNEGFG FVLRGAKADT 60 
PIEEFTPTPA FPALQYLESV DEGGVAWQAG LRTGDFLIEV NNENVVKVGH RQVVNMIRQG 120 
GNHLVLKVVT VTRNLDPDDT ARKKAPPPPK RAPTTALTLR SKSMTSELEE LVDKDKPEEI 180 
VPASKPSRAA ENMAVEPRVA TIKQRPSSRC FPAGSDMNSV YERQGIAVMT PTVPGSPKAP 240 
FLGIPRGTMR RQKSIDSRIF LSGITEEERQ FLAPPMLKFT RSLSMPDTSE DIPPPPQSVP 300 
PSPPPPSPTT YNCPKSPTPR VYGTIKPAFN QNSAAKVSPA TRSDTVATMM REKGMYFRRE 360 
LDRYSLDSED LYSRNAGPQA NFRNKRGQMP ENPYSEVGKI ASKAVYVPAK PARRKGMLVK 420 
QSNVEDSPEK TCSIPIPTII VKEPSTSSSG KSSQGSSMEI DPQAPEPPSQ LRPDESLTVS 480 
SPFAAAIAGA VRDREKRLEA RRNSPAFLST DLGDEDVGLG PPAPRTRPSM FPEEGDFADE 540 
DSAEQLSSPM PSATPREPEN HFVGGAEASA PGEAGRPLNS TSKAQGPESS PAVPSASSGT 600 
AGPGNYVHPL TGRLLDPSSP LALALSARDR AMKESQQGPK GEAPKADLNK PLYIDTKMRP 660 
SLDAGFPTVT RQNTRGPLRR QETENKYETD LGRDRKGDDK KNMLIDIMDT SQQKSAGLLM 720 
VHTVDATKLD NALQEEDEKA EVEMKPDSSP SEVPEGVSET EGALQISAAP EPTTVPGRTI 780 
VAVGSMEEAV ILPFRIPPPP LASVDLDEDF IFTEPLPPPL EFANSFDIPD DRAASVPALS 840 
DLVKQKKSDT PQSPSLNSSQ PTNSADSKKP ASLSNCLPAS FLPPPESFDA VADSGIEEVD 900 
SRSSSDHHLE TTSTISTVSS ISTLSSEGGE NVDTCTVYAD GQAFMVDKPP VPPKPKMKPI 960 
IHKSNALYQD ALVEEDVDSF VIPPPAPPPP PGSAQPGMAK VLQPRTSKLW GDVTEIKSPI 1020 
LSGPKANVIS ELNSILQQMN REKLAKPGEG LDSPMGAKSA SLAPRSPEIM STISGTRSTT 1080 
VTFTVRPGTS QPITLQSRPP DYESRTSGTR RAPSPVVSPT EMNKETLPAP LSAATASPSP 1140 
ALSDVFSLPS QPPSGDLFGL NPAGRSRSPS PSILQQPISN KPFTTKPVHL WTKPDVADWL 1200 
ESLNLGEHKE AFMDNEIDGS HLPNLQKEDL IDLGVTRVGH RMNIERALKQ LLDR 



SEQ ID NO:166 PEZ4 DNA SEQUENCE 

Nucleic Acid Accession #: NM_000024 

Coding sequence: 220-1461 (underlined sequences correspond to start and stop codons) 
1 11 21 31 41 51 

ACTGCGAAGC GGCTIUITCA G AGCACGGGC TGGAACTGGC AGGCACCGCG AGCCCCTAGC 60 
ACCCGACAAG CTGAGTGTGC AGGACGAGTC CCCACCACAC CCACACCACA GCCGCTGAAT 120 
GAGGCTTCCA GGCGTCCGCT CGCGGCCCGC AGAGCCCCGC CGTGGGTCCG CCCGCTGAGG 180 
CGOCCCCAGC CAGTGCGCTT ACCTGCCAGA CTGCGOKX^J]GGGGCAACC CGGGAACGGC 240 
AGCGCCTTCT TGCTGGCACC CAATAGAAGC CATGCGCCGG ACCACGACGT CACGCAGCAA 300 
AGGGACGAGG TGTGGGTGGT GGGCATGGGC ATOGTCATGT CTCTCATCGT CCTGGCCATC 360 
GTGTTTGGCA ATGTGCTGGT CATCACAGCC ATTGCCAAGT TOGAGOGTCT GCAGACGGTC 420 
ACCAACTACT TCATCACTTC ACTGGCCTGT GCTGATCTGG TCATGGGCCT GGCAGTGGTG 480 
OOCTTTOGGO CCGCCCATAT TCTTATGAAA ATGTGGACTT TTGGCAACTT CTGGTGCGAG 540 
TTTTGGACTT CCATTGATGT GCTGTGCGTC ACGGCCAGCA TTGAGACCCT GTGCGTGATC 600 
GCAGTGGATC GCTACTTTGC CATTACTTCA CCTTTCAAGT ACCAGAGOCT GCTGACCAAG 660 
AATAAGGCCC GGGTGATCAT TCTGATGGTG TGGATTGTGT CAGGCCTTAC CTCCTIOTG 720 
COCATTCAGA TGCACTGGTA CCGGGOCACC CACCAGGAAG CCATCAACTG CTATGCCAAT 780 
GAGACCTCCT GTGA L ' UCl 1 CACGAACCAA GCCTATGCCA TTGOCTCTTC CATOGTGTCC 840 
TTCTACGTTC CCCTGGTGAT CATGGTCTTC GTCTACTCCA GGGTC 1 1 ICA GGAGGCCAAA 900 
AGGCAGCTOC AG AAGATTGA CAAATCTGAG GGCOGCTTCC ATGTCCAGAA CCTTAGCCAG 960 
GTGGAGCAGG ATGGGCGGAC GGGGCATGG A CTOOGCAGAT CTTCCAAGTT CTGCTTGAAG 1020 
GAGCACAAAG CCCTCAAGAC GTTAGGCATC ATCATGGGCA CTTTCACCCT CTGCTGGCTG 1080 
CCCTTCTTCA TCGTTAACAT TGTGCATGTG ATCCAGGATA ACCTCATCCXj TAAGGAAGTT 1140 
TACATCCTCC TAAATTGGAT AGGCTATGTC AATTCTGGTT TCAATCCCCT TATCTACTGC 1200 
CGGAGCCCAG ATTTCAGGAT TGCCTTCCAG GAGCTTCTGT GCCTGCGCAG GTCTTCTTTG 1260 
AAGGCCTATG GGAATGGCTA CTOCAGCAAC GGCAACACAG GGGAGCAGAG TGGATA TCAC 1320 
GTGGAACAGG AGAAAGAAAA TAAACTGCTG TGTGAAGACC TCCCAGGCAC GGAAGACTTT 1380 
GTGGGCCATC AAGGTACTGT GCCTAGCGAT AACATTGATT CACAAGGGAG GAATTGTAGT 1440 
ACAAATGACT CACTGCT GTA A AGCAGTTTT TCTACTTTTA AAGACCCCCC CXTCCCCAAC 1500 
AG AACACTAA ACAGACTATT TAACTTGAGG GTAATAAACT TAGAATAAAA TTGTAAAAAT 1560 
TGTATAG AG A TATGCAGAAG GAAGGGCATC CTTCTGCCTT TTTTATTTTT TTAAGCTGTA 1620 
AAAAGAGAGA AAACTTATTT GAGTG ATTAT TTGTTATTTG TACAGTTCAG TTCC7XTTTG 1680 
CATGGAATTT GTAAGTTTAT GTCTAAAGAG CTTTAGTCCT AGAGGACCTG AGTCTGCTAT 1740 
ATTTTCATGA CTTTTCCATG TATCTACC7X ACTATTCAAG TATTAGGGGT AATATATTGC 1800 
TGCTGGTAAT TTGTATCTGA AGGAGATTTT CCTTCCTACA CCCTTGGACT TGAGGATTTT 1860 
GAGTATCTCG GACCTTTCAG CTGTGAACAT GGACTCTTCC CCCACTCCTC TTATTTGCTC 1920 
ACACGGGGTA TTTTAGGCAG GGATTTGAGG AGCAGCTTCA GTTGTTTTCC CGAGCAAAGG 1980 
TCTAAAGTTT ACAGTAAATA AAATGTTTGA CCATG 



SEQ ID NO:167 PEZ4 Protein sequence: 
Protein Accession #: NP_000015.1 
1 11 21 31 41 51 
| | | | | | 

MGQPGNGSAF LLAPNRSHAP DHDVTQQRDE VWVVGMGIVM SLIVLAIVFG NVLVITAIAK 60 
FERLQTVTNY FITSLACADL VMGLAVVPFG AAHILMKMWT FGNFWCEFWT SIDVLCVTAS 120 
IETLCVIAVD RYFAITSPFK YQSLLTKNKA RVIILMVWIV SGLTSFLPIQ MHWYRATHQE 180 
AINCYANETC CDFFTNQAYA IASSIVSFYV PLVIMVFVYS RVFQEAKRQL QKIDKSEGRF 240 
HVQNLSQVEQ DGRTGHGLRR SSKFCLKEHK ALKTLGIIMG TFTLCWLPFF IVNIVHVIQD 300 
NLIRKEVYIL LNWIGYVNSG FNPLIYCRSP DFRIAFQELL CLRRSSLKAY GNGYSSNGNT 360 
GEQSGYHVEQ EKENKLLCED LPGTEDFVGH QGTVPSDNID SQGRNCSTND SLL 



SEQ ID NO:168 PEZ1 DNA SEQUENCE 

Nucleic Acid Accession #: NM_004457 
Coding sequence: 143-2305 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

GAATTCGTTG TTGGGAAGGA CTGGGGAAAC AGCTGTAACA TTTGCCACCC TCAGAAGCTG 60 
CTGGTCCTGT GTCACAOCAC CTTAGCCTCT TGATCG AGGA AG ATTCTCGC TGAAGTCTGT 120 
TAATTCTACT TTTTGAGTAC TTATGAATAA CCACGTGTCT TCAAAACCAT CTACCATGAA 180 
GCTAAAACAT ACCATCAACC CTATTCTTTT ATATTTTATA CATTTTCTAA TATCACTTTA 240 
TACTATnTA ACATACATTC CGTTTTATTT TTTCTCCG AG TCAAG ACAAG AAAAATCAAA 300 

CCGAATTAAA GCAAAGCCTG TAAATTCAAA ACCTGATTCT GCATACAGAT CTGTTAATAG 360 
TTTGGATGGT TTGGCTTCAG TATTATACCC TGGATCTGAT ACTTTAGATA AA G1 ill 1 A C 420 
ATATGCAAAA AACAAATTTA AG AACAAAAG ACTCTTGGG A ACACGTG AAG TTTTAAATG A 480 
GGAAGATGAA GTACAACCAA ATGGAAAAAT TTTTAAAAAG GTTATTCTTG GACAGTATAA 540 
TTGGCTTTCC TATGAAGATG TCTTTGTTCG AGCCTTTAAT TTTGGAAATG GATTACAGAT 600 

GTTGGGTCAG AAACCAAAGA CCAACATCGC CATCTTCTGT GAGACCAGGG CCGAGTGGAT 660 
GATAGCTGCA CAGGCGTGTT TTATGTATAA TTTTCAGCTT GTTACATTAT ATGCCACTCT 720 
AGGAGGTCCA GCCATTG7TC ATGCATTAAA TGAAACAGAG GTGACCAACA TCATTACTAG 780 
TAAAGAACTC TTACAAACAA AGTTGAAGGA TATAGTTTCT TTGGTCCCAC GCCTGCGGCA 840 

CATCATCACT GTTGATGGAA AGCCACCGAC CTGGTCCGAC TTCCCCAAGG GCATCATTGT 900
GCATACCATG GCTGCAGTGG AGGCCCTGGG AGCCAAGGCC AGCATGGAAA ACCAACCTCA 960
TAGCAAACCA TTGCCCTCAG ATATTGCAGT AATCATGTAC ACAAGTGGAT CCACAGGACT 1020 
TCCAAAGGGA GTCATGATCT CACATAGTAA CATTATTGCT GGTATAACTG GGATGGCAGA 1080 
AAGGATTCCA GAACTAGGAG AGGAAGATGT CTACATTGGA TATTTGCCTC TGGCCCATGT 1140
TCTAGAATTA AGTGCTGAGC TTGTCTGTCT TTCTCACGGA TGCCGCATTG GTTACTCTTC 1200
ACCACAGACT TTAGCAGATC AGTCTTCAAA AATTAAAAAA GGAAGCAAAG GGGATACATC 1260
CATGTTGAAA CCAACACTGA TGGCAGCAGT TCCGGAAATC ATGGATCGGA TCTACAAAAA 1320
TGTCATGAAT AAAGTCAGTG AAATGAGTAG TTTTCAACGT AATCTGTTTA TTCTGGCCTA 1380 
TAATTACAAA ATGGAACAGA TTTCAAAAGG ACGTAATACT CCACTGTGCG ACAGCTTTGT 1440 
TTTCCGGAAA GTTCGAAGCT TGCTAGGGGG AAATATTCGT CTCCTGTTGT GTGGTGGCGC 1500 

TCCACTTTCT GCAACCACGC AGCGATTCAT GAACATCTGT TTCTGCTGTC CTGTTGGTCA 1560
GGGATACGGG CTCACTGAAT CTGCTGGGGC TGGAACAATT TCCGAAGTGT GGOACTACAA 1620
TACTGGCAGA GTGGGAGCAC CATTAGTTTG CTGTGAAATC AAATTAAAAA ACTGGGAGGA 1680
AGGTGGATAC TTTAATACTG ATAAGCCACA CCCCAGGGGT GAAATTCTTA TTGGGGGCCA 1740
AAGTGTGACA ATGGGGTACT ACAAAAATGA AGCAAAAACA AAAGCTGATT TCTCTGAAGA 1800
TGAAAATGGA CAAAGGTGGC TCTGTACTGG GGATATTGGA GAGTTTGAAC CCGATGGATG 1860
CTTAAAGATT ATTGATCGTA AAAAGGACCT TGTAAAACTA CAGGCAGGGG AATATGTTTC 1920
TCTTGGGAAA GTAGAGGCAG CTTTGAAGAA TCTTCCACTA GTAGATAACA TTTGTGCATA 1980 
TGCAAACAGT TATCATTCTT ATGTCATTGG ATTTGTTGTG CCAAATCAAA AGGAACTAAC 2040 
TGAACTAGCT CGAAAGAAAG GACTTAAAGG GACTTGGGAG GAGCTGTGTA ACAGTTGTGA 2100 

AATGGAAAAT GAGGTACTTA AAGTGCTTTC CGAAGCTGCT ATTTCAGCAA GTCTGGAAAA 2160
GTTTGAAATT CCAGTAAAAA TTCGTTTGAG TCCTGAACCG TGGACCCCTG AAACTGGTCT 2220
GGTGACAGAT GCCTTCAAGC TGAAACGCAA AGAGCTTAAA ACACATTACC AGGCGGACAT 2280
TGAGCGAATG TATGGAAGAA AATAATTATT CTCTTCTGGC ATCAGTTTGC TACAGTGAGC 2340 
TCACATCAAA TAGGAAAATA CTTGAAATGC ATGTCTCAAG CTGCAAGGCA AACTCCATTC 2400 

CTCATATTAA ACTATTACTT CTCATGACGT CACCATTTTT AACTGACAGG ATTAGJAAAA 2460
CATTAAGACA GCAAACTTGT GTCTGTCTCT TCTTTCATTT TCCCCGCCAC CAACTTACTT 2520
TACCACCTAT GACTGTACTT GTCAGTATGA GAATTTTTCT GAATCATATT GGGGAAGCAG 2580 
TGATTTTAAA ACCTCAAGTT TTTAAACATG ATTTATATGT TCTGTATAAT GTTCAGTTTG 2640 

TAACTTTTTA AAAGTTTGGA TGTATAGAGG GATAAATAGG AAATATAAGA ATTGGTTATT 2700
TGGGGGCTTT TTTACTTACT GTATTTAAAA ATACAAGGGT ATTGATATGA AATTATGTAA 2760
ATTTCAAATG CTTATGAATC AAATCATTGT TGAACAAAAG ATTTGTTGCT GTGTAATTAT 2820
TGTCTTGTAT GCATTTGAGA GAAATAAATA TACCCATACT TATGTTTTAA GAAGTTGAGA 2880
TCTTGTGAAT ATATGCCTGT CAGTGTCTTC TTTATATATT TATTTTTTAT TAGAAAAAAT 2940
GAAGTTTGGT TGGTGATGCA TGAAACAAAA TAGCAAGAGA GGGTTATAGT TTAATAGTAA 3000 

GGGAGATAAC ACAGCATGTG TAGCACCAGT TGATAATTGG TCTCTAGTAG CTTACTGTCA 3060
AAATGTTCAA TGAAGTCTTC TGTTCATCTG TTOAAACTAO GAAAATACCC AAACTTAAAT 3120 
GGAAGAATTC TGAAAGAGAG GATAGAATTT AAAGAACAAG AGTATATAAA GTTATTCTTT 3180 
GAATATTTCG TTGACTATAT GTACATTGAG TTATCTATAT TTGTAAACAA ATTAGTCATG 3240 
GAAAATTATT CTATTCCAAA GTCTCCTTTT AGTCTAGATA ATCATTATTT CATTTTAAAA 3300
TTAGTGTTTT TCATAGTTTG CACTGATGCG TGTATGGATG TGTGTGAGTC AGTGGTAGCT 3360
TATTTAAAAA GCACCTTATC CTTTCTCCCA TAACCTTTGT ACACTAAAAA ATGAAAGAAT 3420
TTAGAATGTA TTTGATGATA GCATTCTCAC TAAGACACAT GAGAATTTAA CTTTATAACC 3480
GCGTGAGTTA AGATTTAATT CATAGGTTTT GATGTCATTG TTGAAGTTAT TTGTAATTCA 3540
GAAACCTTGC TTGTGTGATA CATAGTAAGT CTCTTCATTT ATTACTGCTT GCCTGTTGTT 3600
ATATCTGGAT TATCAAAAGC AATAGTGCAC CAATTAAOAT GTGCTCAAAT CAOOACTTAA 3660
ATCATAGGCA CCACA 1 I I 1 1 CATGTCAGAC TAGTTACTTr GTTGATTCTC AGTTACTGTA 3720
GGCATCAAAA GGCAAAAATC A



SEQ ID NO:169 PEZ1 Protein sequence:
Protein Accession #: NP_004448.1

1 11 21 31 41 51

I I I I I I
MNNHVSSKPS TMKLKHTINP ILLYFIHFU SLYTILTYIP FYFFSESRQE KSNRIKAKPV 60 
NSKPDSAYRS VNSLDGLASV LYPGCDTLDK VFTYAKNKFK NKRLLGTRBV LNEEDEVQPN 120 
GKIFKKVILG QYNWLSYEDV FVRAFNFGNG LQMLGQKPKT NIAIPCETRA EWMIAAQACF 180 
MYNPQLVTLY ATLGGPATVH ALNETEVTNI ITSKELLQTK LKDIVSLVPR LRHHTVDGK 240 
PPTWSDFPKG IIVHTMAAVE ALGAKASMEN QPHSKPLPSD IAVIMYTSGS TGLPKGVM1S 300 
HSNnAGITG MAER1PELGE EDVYIG YLPL AHVLELSAEL VCLSHGCRIG YSSPQTLADQ 360 
S5KKKGSKG DTSMLKPTLM AAVPEIMDRI YKNVMNKVSE MSSFQRNLFI LAYNYKMEQl 420 
SKGRNTPLCD SFVFRKVRSL LGGNIRLLLC GGAPLSATTQ RFMN1CFCCP VGQGYGLTES 480 
AGAGTISEVW DYNTGRVGAP LVCCEUOKN WEEGGYFNTD KPHPRGEHJ GGQSVTMGYY 540 
KNEAKTKADF SEDENGQRWL CTGDIGEFEP DGCLKIIDRK KDLVKLQAGE YVSLGKVEAA 600 
UCNLPLVDNI CAYANSYHSY VIGFWPNQK ELTELARKKG LKGTWEELCN SCEMENEVLK 660 
VLSEAAISAS LEKFEIPVKI RLSPEPWTPE TGLVTDAFKL KRKELKTHYQ ADIERMYGRK 



SEQ ID NO:170 PCQ7 DNA SEQUENCE
Nucleic Acid Accession #: none found

Coding sequence: 38-1075 (underlined sequence corresponds to start and stop codon)

1 11 21 31 41 51

I I I I I I

AGCAACGACG CCGGGCAGCG GGAGCGGCGG CCGCGCCATG TGGCTGCTGG GGCCGCTGTG 60
CCTGCTGCTG A6CAGC6CC6 CGGAGAGCCA GCTGCTCCCC GGGAACAACT TCACCAATGA 120
GTGCAACATA CCAGGCAACT TCATGTGCAG CAATGGACGG TGCATCCCGG GCGCCTGGCA 180
[missing] CTGCCTGACT GCTTCGACAA GAGTGATGAG AAGGAGTGCC CCAAGGCTAA 240
GTCGAAATGT GGCCCAACCT TCTTCCCCTG TOCCAGCGGC ATCCATTGCA TCATTGGTCG 300
CTTCCGGTGC AATGGGTTTG AGGACTGTCC CGATGGCAGC GATGAAGAGA ACTGCACAGC 360
AAACCCTCTG CTTTGCTCCA CCGCCCGCTA CCACTGCAAG AACGGCCTCT GTATTGACAA 420
GAGCTTCATC TGCGATGGAC AGAATAACTG TCAAGACAAC AGTGATGAGG AAAGCTGTGA 480
AAGTTCTCAA GAACCCGGCA GTGGGCAGGT GTTPGTGACT TCAGAGAACC AACTTGTGTA 540
TTACCCCA6C ATCACCTATG CCATCATCGG CAGCTCCGTC ATTTTTGTGC TGGTGGTGGC 600
CCTGCTGGCA CTGGTCTTGC ACCACCAGCG GAAGCGGAAC AACCTCATGA CGCTGCCCGT 660
GCACGGGCTG CAGCACCCTG TGCTGCTGTC CCGCCTGGTG GTCCTGGACC ACCCCCACCA 720
CTGCAACGTC ACCTACAACG TCAATAATGG CATCCAGTAT GTGGCCAGCC AGGCGGAGCA 780
GAATGCGTCG GAAGTAGGCT CCOCACOCTC CTACTCCGAG GCCTTGCTGG ACCAGAGGCC 840
TGCGTGGTAT GACCTTCCTC CACCGCCCTA CTCTTCTGAC ACGGAATCTC TGAACCAAGC 900
CGACCTGCCC CCCTACCGCT CCCGGTCCGG GAGTGCCAAC AGTGCCAGCT CCCAGGCAGC 960
CAGCAGOCTC CTGAGCGTGG AAGACACCAG CCACAGCCCG GGGCAGCCTG GCCCCCAGGA 1020
GGGCACTGCT GAGCCCAGGG ACTCTGAGCC CAGCCAGGGC ACTGAAGAAG TATAAGTCCC 1080
AQTTATTCCA AAGTCCATAT GGGTTAATCT GCTCTGACTT GTTGCCA3TC TAACAATTTG 1140
T6CTCAT6G0 AAGCTCTTTA AGCACCTGTA ACGATGTCTC AAGTTACAGT TTGGGATATT 1200
AACTATCTCT GCATTCCCCT CCTCCCCCAG ACTTCAGAGA TGTTTTTCTG GCGTCTCAGT 1260
TGACATGATC TGTTGTGOGT CTTTTCTGTC AGGTCACTCT TCOCTTGGGA CC0GA6ATCA 1320
CACCCTCATT TTTCACATTA TTCTGTTTCT [missing] CAGCATATAA AACAGTATTG 1380
AAATAGGCTG GGAGAGAGCA ATGTTTCTGT GCTATATTGG ATGCTCAGAA GTGCAGGAGA 1440
C6CTGGACCC AATTCTCTCT GCTGGGTAGT TACCTTATAG CATTTGGGGA TTTGGGTTAG 1500
ATGATCTAAC CAGGAGGCCA TCACTGGATG GTCAC000CC CAAAAAAATT CCATTTGAGC 1560
ATCAAAACCT GCTTTGCACA ATCCTATTTG ATGCCCCCAG TTCAGCAGAG TCAGTGGCCA 1620
AAGAAAACTT TGGACGTGAG TAACACCCTT CAGCAGTCGC AACGTTATTT TGGTTTTGTG 1680
AAGGACTCTG AAACCATCTA CCCTGTATAA ATTCTGGCTT TAGAAATTTG CCCAAGAATG 1740
CTCATTCTGA GAGCTTTCCT CAGCAGCATA TATCATCAGC CTCATCCTAA AATAGGCAGG 1800
GAGCCCCTCC CATGAGTTTA TCCAAGTTCT CAGCTCCTAA AATGCAGGCT GCCAAGACCC 1860
TACACCTGCC CTGGCTCTAC AGCCACTTAC CTGGTTTCTG GACTGTCACC CTCCCAGCTG 1920
ACCTGCCCGT AGCCAAGGAA TGAGGACCTA ACTTGAGTTG GCCCAAAGTC TGACCTGGCT 1980
GTATGTCCCT GTGGCCCACA CCCAGCCTGT CTTGCTCATT CATGCAGCCT CAACACTGGC 2040
CTCCAAAGTT CCCTTAACAC TTGCAAAGTC CTTTTTACCT GTGCATTTGG ACTTGAGGAC 2100
ACTGGTTTCT ATCACAGGTG AGAGCCATGT TCAATACCTC CAGCAAGCTC TCCTGGCTCC 2160
CTGCACTGTG CACGCTCCTC TTCCCAAGGT CCCAATACCA GCACCTCTAG TTAGAGTTAG 2220
GGTCAGGGTC AGGCCTCTCC CAACATCCCA GTAGTTTCTC CTCTGAGACA CATGGGCAAG 2280
AGACAATTTG GAGTCAAGAT TTTCCATTTG GATCTATTTT AAATCTTTTA GAAATGCATT 2340
TGAAACAGTG TGTTTGTTTT TTCCCTTCTA GTTAAGGGAC TATTTATATG TGTATAGGAA 2400
AGCTGTCTCT TTTTTTGTTT TTCCTTTAAC AAGGTCCAAA GAAAGATGCA AAAGGAGATC 2460
ACACCCTTGC CCCGCTQAGC CCCGTGATAA CAAGTCACTC CAGACTAACC TGTGTGCCAG 2520
ACATTTGTGC ATTGTTGCAC TTTGAGGTTA TTATTTATCA AGTTCTTGAA GGAAGCAGAA 2580
AGAGGGACTC CTCTCTCCCT CCGTGTATAG TCTCTATGTT TGTGCTAGTT TTTCTTTTTT 2640
TTCTCTGTGT CCAGTCAGCC ACAGGGCCCG CCTCCCTGCA GGAATAAGGG GTAAAACGTT 2700
AGGTGTTGTT TGGCAAGAAA CCACACTGAC TGATGAGGGG TAAAATGGAA CCAGGTAGAG 2760
CCACTCCGGG CAGCTGTCAC CCATTCAGAA CTTCTTTCCG CAGCTGAAGA AATGTTCAGT 2820
AACCTGTTTG ACGCTAATTA AAACAGAGCC TGCAGGAAGT GGGGCTAAAG TGGCATTCAG 2880
TGATCCTGTT CTGTAGACTT TTCTTTCTTT TTTTAACCAA ATCCAAAGGA TGTTACAGAA 2940
AAGCTAGCCA CTGGTATTTT GTTTTGTTTA AAAAAAAAAA GAAAGAAAGA AAQAAAGAAA 3000

AACGGAAAGG AACCTAGCTG CCTGTATCTT TCATTTTTAA AATAGCACTT GAGTTATTTT 3060 

CTGAGTAATC CAATAAAGAA CTTTTGATGA CAGCCAQAAT GTGTTAGAAC TCTGGCTGAA 3120 

CATTTCATCT CCTGTGAGTC AGAAGGGCTT TATTTCTCCC TTTGATGGGG CCCCTTCTTC 3180 

TTTCTGGTGC TCTGGAAGTT GTTTAGAGGA AAGAATTCTA ATTTTAATTA ATTGCGCAGT 3240 

GAGTTAATCT CACTCGCTTT TCTGCTTCCA GGCATCTTAG GAAAAACAAA TGGTTTTAGT 3300 

AGATAAGGGA TGCCTACTAA TGCTTTTTTA AAACAAACAG GGACATTTTT ATTATAGATT 3360 

TGATTTTTTT AATGAATGTT TTTAAAAATA TATAAATAGG ACACCAAAGC GGCAGGGTTT 3420 

TTTTTGGGGG GAGGGGGTTT GTTTTOCAAC TCAAGATGGC ACATTAGTGG CCAGCAATAT 3480 

TTTTTAACTC ATTCCAACCA GGAAGCTTTT TTATACATTG CCTAAATCTA CGCCAACCAG 3540 

AAAATAGTCT CATCTCTTTT TTTCTCAAAT GAGATCCGTG TTTTATTTTA GCATTAAATT 3600 

AGTTACACTG TGATGACTGG CCTATEACCT GACTCAGCTC CCTCTACCTT GAAATTGACA 3660 

TTTTTAAAAA ATGCAACTAA GTGGTTAATA GTGTGTGACG CTCAAAGTTA ATGTAAACTG 3720 

GAAAGGTTGT GTGTCGTTGC CTTTT 8TBCT TTGGTTAGGC TT GGT TTTCT TTTTTAATTT 3780 

TTATACTTTC TAATAAATTT GCAGTTTCAT TCTTTCTGTT TGTGCAAAWG GWMCTAMABH 3840 

AAMHAAAAAC AWYWTTGGGG GGGCTTGGGC CTCGGAAAAA GTTTTTAACA CCACTTCGGG 3900 

TGGGGCGGCG GGGCCCACGT AGGTACGGCG ACCACGGGGG CCCAAACGGG ACCCCAGAAG 3960 

GAAACCCTGG CCAAGAAAAA GGTGGCGAGA ATTCTCCACA CCAGAAAAAA ACGCGCCGGG 4020 

GGAAACCGCA GAGTGTTGCG TAAACCACAC CCGAAGAGAG AACTCAGAAG CACACAAGCG 4080 
GGACTCAACC AGGAGGACCC AAGGGAACCC GATAGAGTAC G 



SEQ ID NO:171 PCQ7 PROTEIN SEQUENCE
Protein Accession #: none found



1 11 21 31 41 51 

I I I I I I 

MWLLGPLCLL LSSAAESQLL PGNNFTOBCN IFGNFHCSNG RCIPGAWQCD GLPDCFDKSD 60 

EKECPKAKSK CGPTFFPCAS GXHCXIGRFR CNGFEDCPDG SDEENCTANP LLCSTARYHC 120 

KNGLCIDKSP ICDGQNNCOD NSBEESCESS QEPGSGQVFV TSENQLVYYP SITYAIIGSS 180 

VIFVLWALL ALVLHHQRKR NNLMTLPVHR LQHPVLLSRL WLDHPHECN VTYNVNNGIQ 240 

YVASQAEQNA SEVGSPPSYS EALLDQRPAW YDLPPPPYSS DTESLNQADL PPYRSRSGSA 300 
NSASSQAASS LLSVEDTSHS PGQPGPQEGT AEPRDSEPSQ GTEEV 



SEQ ID NO:172 PEL3 DNA SEQUENCE
Nucleic Acid Accession #: NM_005658.1

Coding sequence: 57-1535 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

GTCATATTGA ACATTCCAGA TACCTATCAT TACTCGATGC TGTTGATAAC AGCAAGATGG 60 

CTTTGAACTC AGGGTCACCA CCAGCTATTG GACCTTACTA TGAAAACCAT GGATACCAAC 120 

CGGAAAACCC CTATCCCGCA CAGCCCACTG TGGTCCCCAC TGTCTACGAG GTGCATCCGG 180 

CTCAGTACTA OCOGTCOCCC GTGCCCCAGT ACGCCCCGAG GGTCCTGACG CAGGCTTCCA 240 

ACCCCGTCGT CTGCACGCAG CCCAAATOCC CATCCGGGAC AGTGTGCACC TCAAAGACTA 300 

AGAAAGCACT GTGCATCACC TTGAOCCTGG GGACCTTCCT CGTGGGAGCT GCGCTGGCCG 360 

CTGGCCTACT CTGGAAGTTC ATGGGCAGCA AGTGCTCCAA CTCTGGGATA GAGTGCGACT 420 

CCTCAGGTAC CTGCATCAAC CCCTCTAACT GGTGTGATGG CGTGTCACAC TGCCCCGGCG 480 

GGGAGGACGA GAATCGGTGT GTTCGCCTCT ACGGACCAAA CTTCATCCTT CAGATGTACT 540 

CATCTCAGAG GAAGTCCTGG CACCCTGTGT GCCAAGACGA CTGGAACGAG AACTACGGGC 600 

GGGCGGCCTG CAGGGACATG GGCTATAAGA ATAATTTTTA CTCTAGCCAA GGAATAGTGG 660 

ATGACAGCGG ATCCACCAGC TTTATGAAAC TGAACACAAG TGCCGGCAAT GTCGATATCT 720 

ATAAAAAACT GTACCACAGT GATGCCTGTT CTTCAAAAGC AGTGGTTTCT TTACGCTGTT 780 

TAGOCTGOGG GGTCAACTTG AACTCAAGCC GCCAGAGCAG GATCGTGGGC GGTGAGAGCG 840

CGCTCCCGGG GGCCTGGCCC TGGCAGGTCA GCCTGCACGT CCAGAACGTC CACGTGTGCG 900 

GAGGCTCCAT CATCACCCCC GAGTGGATCG TGACAGCCGC CCACTGCGTG GAAAAACCTC 960 

TTAACAATCC ATGGCATTGG ACGGCATTTG CGGGGATTTT GAGACAATCT TTCATGTTCT 1020 

ATGGAGCCGG ATACCAAGTA CAAAAAGTGA TTTCTCATCC AAATTATGAC TCCAAGACCA 1080 

AGAACAATGA CATTGCGCTG ATGAAGCTGC AGAAGCCTCT GACTTTCAAC GACCTAGTGA 1140 

AACCAGTGTG TCTGCCCAAC CCAGGCATGA TGCTGCAGCC AGAACAGCTC TGCTGGATTT 1200 

CCGGGTGGGG GGCCACCGAG GAGAAAGGGA AGACCTCAGA AGTGCTGAAC GCTGCCAAGG 1260 

TGCTTCTCAT TGAGACACAG AGATGCAACA GCAGATATGT CTATGACAAC CTGATCACAC 1320 

CAGCCATGAT CTGTGCCGGC TTCCTQCAGG GGAACGTCGA TTCTTGCCAG GGTGACAGTG 1380 

GAGGGCCTCT GGTCACTTCG AACAACAATA TCTGGTGGCT GATAGGGGAT ACAAGCTGGG 1440 

GTTCTGGCTG TGCCAAAGCT TACAGACCAG GAGTGTACGG GAATGTGATG GTATTCACGG 1500 

ACTGGATTTA TCGACAAATG AAGGCAAACG GCTAATCCAC ATGGTCTTCG TCCTTGACGT 1560 

CGTTTTACAA GAAAACAATG GGGCTGGTTT TGCTTCCCCG TGCATGATTT ACTCTTAGAG 1620 

ATGATTCAGA GGTCACTTCA TTTTTATTAA ACAGTGAACT TGTCTGGCTT TGGCACTCTC 1680 

TGCCATACTG TGCAGGCTGC AGTGGCTCCC CTGCCCAGCC TGCTCTCCCT AACCCCTTGT 1740 

CCGCAAGGGG TGATGGCCGG CTGGTTGTGG GCACTGGCGG TCAATTGTGG AAGGAAGAGG 1800 

GTTGGAGGCT GCCCCCATTG AGATCTTCCT GCTGAGTCCT TTCCAGGGGC CAATTTTGGA 1860

TGAGCATGGA GCTGTCACTT CTCAGCTGCT GGATGACTTG AGATGAAAAA GGAGAGACAT 1920 

GGAAAGGGAG ACAGCCAGGT GGCACCTGCA GCGGCTGCCC TCTGGGGCCA CTTGGTAGTG 1980 

TCCCCAGCCT ACTTCACAAG GGGATTTTGC TGATGGGTTC TTAGAGCCTT AGCAGCGCTG 2040 

GATGGTGGCC AGAAATAAAG GGACCAGCCC TTCATGGGTG GTGACGTGGT AGTCACTTGT 2100 

AAGGGGAACA GAAACATTTT TGTTCTTATG GGGTGAGAAT ATAGACAGTG CCCTTGGTGC 2160

GAGGGAAGCA ATTGAAAAGG AACTTGCCCT GAGCACTCCT GGTGCAGGTC TCCACCTGCA 2220

CATTGGGTGG GGCTCCTGGG AGGGAGACTC AGCCTTCCTC CTCATCCTCC CTGACCCTGC 2280 

TCCTAGCACC CTGGAGAGTG AATGCCCCTT GGTCCCTGGC AGGGCGCCAA GTTTGGCACC 2340 

ATGTCGGCCT CTTCAGGCCT GATAGTCATT GGAAATTGAG GTCCATGGGG GAAATCAAGG 2400 

ATGCTCAGTT TAAGGTACAC TGTTTCCATG TTATGTTTCT ACACATTGAT GGTGGTGACC 2460 
CTGAGTTGAA AGCCATCTT 

SEQ ID NO:173 PEL3 Protein sequence:

Protein Accession #: NP_005647.1

1 11 21 31 41 51

I I I I I I

MALNSGSPPA IGPYYENHGY QPENPYPAQF TWPTVYEVH PAQYYPSPVP QYAFKVLTQA 60 

SHFWCTQPK SPSGTVCTSK TKKALCITLT LGTFLVGAAL AAGLLWKFKG SKCSNSGIEC 120 

DSSGTCZHPS NWCDGVSHCP GGEDENRCVR LYGPNFILQM YSSQRKSWHP VCQDDWNENY 180 

GRAACRDHGY KNNFYSSQGI VDDSGSTSFM KLNTSAGNVD IYKKLYHSDA CSSKAWSLR 240 

CLACGVKLNS SRQSRXVGGE SALPGAWPWQ VSLHVQNVHV CGGSZITPEW IVTAAHCVBK 300 

PLNNPWHWTA FA6ZLRQSFH FYGAGYQVQK VISHPNYD5K TKNNDIALMK LGKPLTFNDL 360 

VKPVCLPNPG MMLQPEQLCW ISGWGATEEK GKTSEVLNAA KVLLIETQRC KSRYVYDNLI 420 

TPAHXCAGFL QGNVDSCQGD SGGPLVTSNN NTWWLIGDTS WGSGCAKAYR PGVYGNVMVF 480 
TDWIYRQMKA NG 

SEQ ID NO:174 PBJ4 DNA SEQUENCE

Nucleic Acid Accession #: AI694767

Coding sequence: 130-1086 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

CAGAGAGGCT GTATTTCAGT GCAGCCTGCC AGACCTCTTC TGGAGGAAGA CTGGACAAAG 60 

GGGGTCACAC AOTCCTTCCA TACGGTTGAG CCTCTACCTG CCTGGTGCT6 GTCACAGTTC 120 

AGCTTCTTCA TGATGGTGGA TCCCAATGGC AATGAATCCA GTGCTACATA CTTCATCCTA 180 

ATAGGCCTCC CTGGTTTAGA AGAGGCTCAG TTCTGGTTGG CCTTCCCATT GTGCTCCCTC 240 

TACCTTATTG CTGTGCTAGG TAACTTGACA ATCATCTACA TTGT6CGGAC TGAGCACAGC 300 

CTGCATGAGC CCATGTATAT ATTTCTTTGC ATGCTTTCAG GCATTGACAT CCTCATCTCC 360 

ACCTCATCCA TGCCCAAAAT GCTGGCCATC TTCTGGTTCA ATTCCACTAC CATCCAGTTT 420 

GATGCTTGTC TGCTACAGAT GTTTGCCATC CACTCCTTAT CTGGCATGGA ATCCACAGTG 480 

CTGCTGGCCA TGGCTTTTGA CCGCTATGTG GCCATCTGTC ACCCACTGCG CCATGCCACA 540 

GTACTTACGT TGCCTCGTGT CACCAAAATT GGTGTGGCTG CTGTGGTGCG GGGGGCTGCA 600 

CTGATGGCAC CCCTTCCTGT CTTCATCAAG CAGCTGCCCT TCTGCCGCTC CAATATCCTT 660 

TCCCATTCCT ACTGCCTACA CCAAGATGTC ATGAAGCTGG CCTGTGATGA TATCCGGGTC 720 

AATGTCGTCT ATGGCCTTAT CGTCATCATC TCCGCCATTG 6CCT6GACTC ACTTCTCATC 780 

TCCTTCTCAT ATCTGCTTAT TCTTAAGACT GTGTTGGGCT TGACACGTGA AGCCCAGGCC 840 

AAGGCATTTG GCACTTGCGT CTCTCATGTG TCTGCTGTGT TCATATTCTA TGTACOTTTC 900 

ATTGGATTGT CCATGQTGCA TOGCTTTAGC AAGCGGCGTG ACTCTCCACT GCCCGTCATC 960 

TTGGCCAATA TCTATCTGCT GGTTCCTCCT GTGCTCAACC CAATTGTCTA TGGAGTGAAG 1020 

ACAAAGGAGA TTCGACAGCG CATCCTTCGA CTTTTCCATG TGGCCACACA CGCTTCAGAG 1080 

CCCTAGGTST CAGTGATCAA ACTTCTTTTC CATTCAGAGT CCTCTGATTC AGATTTTAAT 1140 

GTTAACATTT TGGAAGACAG TATTCAGAAA AAAAATTTCC TTAATAAAAA TACAACTCAG 1200 

ATCCTTCAAA TATGAAACTG GTTGGGGAAT CTCCATTTTT TCAATATTAT TTTCTTCTTT 1260 

GTTTTCTTGC TACATATAAT TATTAATACC CTGACTAGGT TGTGGTTGGA GGGTTATTAC 1320 

TTTTCATTTT ACCATGCAGT CCAAATCTAA ACTGCTTCTA CTGATGGTTT ACAGCATTCT 1380 

GAGATAAGAA TGGTACATCT AGAGAACATT TGCCAAAGGC CTAAGCACAG CAAAGGAAAA 1440 

TAAACACAGA ATATAATAAA ATGAGATAAT CTAGCTTAAA ACTATAACTT CCTCTTCAGA 1500 

ACTCCCAACC ACATTGGATC TCAGAAAAAT ACTGTCTTCA AAATGACTTC TACAGAGAAG 1560 

AAATAATTTT TCCTCTGGAC ACTAGCACTT AAGGGGAAGA TTGGAAGTAA AGCCTTGAAA 1620 

AGAGTACATT TACCTACGTT AATGAAAGTT GACACACTGT TCTGAGAGTT TTCACAGCAT 1680 

ATGGACCCTG TTTTTCCTAT TTAATTTTCT TATCAACCCT TTAATTAGGC AAAGATATTA 1740 
TTAGTACCCT CATTGTAGCC ATGGGAAAAT TGATGTTCAG TGGGGATCAG TGAATTAAAT 1800

GGGGTCATAC AAGTATAAAA ATTAAAAAAA AAAGACTTCA TGCCCAATCT CATATGATGT 1860

GGAAGAACTG TTAAAGAGAC CAACAGGGTA GTGGGTTAGA GATTTCCAGA GTCTTACATT 1920 

TTCTARAGGA GGTATTTAAT TTCTTCTCAC TCATCCAGTG TTGTATTTAG GAATTTCCTG 1980 

GCAACAGAAC TCATGGCTTT AATCCCACTA GCTATTGCTT ATTGTCCTGG TCCAATTGCC 2040 

AATTACCTGT GTCTTGGAAG AAGTGATTTC TAGGTTCAOC ATTATGGAAG ATTCTTATTC 2100 

AGAAAGTCTG CATAGGGCTT ATAGCAAGTT ATTTATTTTT AAAAGTTCCA TAGGTGTTTC 2160 

TGATAGGCAG TGAGGTTAGG GAGCCACCAG TTATGATGGG AAGTATGGAA TGGCAGGTGT 2220 

TGAAGATAAC ATTGGCCTTT TGAGTGTGAC TCGTAGCTGG AAAGTGAGGG AATCTTCAGG 2280 

ACCATGCTTT ATTTGGGGCT TTGTGCAGTA TGGAACAGGG ACTTTGAGAC CGGGAAAGCA 2340 

ATCTGACTTA GGCATGGGAA TCAGGCATTT TTGCTTCTGA GGGGCTATTA CCAAGGGTTA 2400 

ATAGGTTTCA TCTTCAACAG GATATGACAA CAGTCTTAAC CAAGAAACTC AAATTACATA 2460 

TACTAAAACA TGTGATCATA TATGTGGTAA GTTTCATTTT CTTTTTCAAT CCTCAGGTTC 2520 

CCTGATATGG ATTCCTATNA CATGCTTTCA TCCCCTTTTG TAATGGATAT CATATTTGGA 2580 

AATGCCTATT TAATACTTGT ATTTGCTGCT GGACTGTAAG CCCATGAGGG CACTGTTTAT 2640 

TATTGAATGT CATCTCTGTT CATCATTGAC TGCTCTTTGC TCATCATTGA ATCCCCCAGC 2700 

AAAGTGCCTA GAACATAATA GTGCTTATGC TTGACACCGG TTATTTTTCA TCAAACCTGA 2760 

TTCCTTCTGT GCTGAACACA TAGCCAGGCA ATTTTCCA6C CTTCTTTGAG TTGGQTATTA 2820 

TTAAATTTTA GCCATTACTT CCAATGTGAG TGGAAGTGAC ATGTGCAATT TTTATACCTG 2880 

GCTCATAAAA CCCTCCCATG TGCAGOCTTT CATGTTGACA TTAAATGTGA CTTGGGAAGC 2940

TATGTGTTAC ACAQAGTTAA TTAACCNGAA AGGCCTGGNA ATTTTTTGNN AANNAAACTG 3000
TGGCCNNGAG GCCCNCAACC CTTTTTNNNA ATTTGGCAAN NTCCCACTTT GTANTTTQGT 3060 
AAGGAGGCCA GTTGGATAAG TGAAAAATAA AGTACTATTG TGTC 



SEQ ID NO:175 PBJ4 PROTEIN SEQUENCE 
Protein Accession #: not available, cloned at Eos



1 11 21 31 41 51 

I I I I I I 

MVDPNGNSSS ATYFILIGLP GLEEAQFVJLA FPLCSLYLIA VLGNLTIIYT VRTEHSLHEP 
MYIFLCKLSG IDILISTSSH PKMLAIFWN STTIQFDACL LQMFAIHSLS GKESTVLLAM 
AFDRYVAICH PLRHATVLTL PRVTKIGVAA WRGAALMAP LPVFIKQLPF CRSNILSHSY 
CLHQDVMKLA CDDIRVNWY GLIVIISAIG LCSLLISFSY LLILKTVLGL TREAQAKAFG 
TCVSHVCAVP IFYVPFIGLS HVHRFSKRRD SPLPVILANI YLLVPPVLNP IVYGVKTKEI 
RQRILRLFHV ATHASEP 

SEQ ID NO:176 PM72 DNA SEQUENCE
Nucleic Acid Accession #: NM_004624.1

Coding sequence: 57-1544 (underlined sequences correspond to start and stop codons) 

TCGGAGCCTG CGGAGGGTGG TGGTGGTGGT GGTGGTGGCC CTCGCCCGCC TCACTCATGC 60 

CTCCTCCTCC TCTGCTCTCG CTCAGGCGCC TCGGTGGCGG TTGGTCGGCG GTTACGCGGC 120 

TGGTGGTCGC GGCGGCCGGG GCTCGCTCTC GGGGAGGCCG GGGCGGATCT CGCGGCGCAG 180 

GCGGCGGCGG CCGAGGTGGG GTCGCGCGGC GGAGGCGGCT CGAGCTTCGT GCTGCGCGCT 240 

CGCTCTTGGG CTCCTCGCTG CAGGAGGAGT GTGACTATGT GCAGATGATC GAGGTGCAGC 300 

ACAAGCAGTG CCTGGAGGAG GCCGAGCTGG AGAATGAGAC AATAGGCTGC AGCAAGATGT 360 

GGGACAACCT CACCTGCTGG CCAGCCAOCC CTCGGGGCCA GGTAGTTGTC TTGGCCTGTC 420 

CCCTCATCTT CAAGCTCTTC TCCTCCATTC AAGGCCGCAA TGTAAGCCGC AGCTGCACCG 480 

ACGAAGGCTG GACGCACCTG GAGCCTGGCC CGTACCCCAT TGCCTGTGGT TTGGATGACA 540 

AGGCAGCGAG TTTGGATGAG CAGCAGACCA TGTTCTACGG TTCTGTGAAG ACCGGCTACA 600 

CCATTGGCTA CGGCCTGTCC CTCGCCACCC TTCTGGTCGC CACAGCTATC CTGAGCCTGT 660 

TCAGGAAGCT CCACTGCACG CGGAACTACA TCCACATGCA CCTCTTCATA TCCTTCATCC 720 

TGAGGGCTGC CGCTGTCTTC ATCAAAGACT TGGCCCTCTT CGACAGCGGG GAGTCGGACC 780 

AGTGCTCCGA GGGCTCGGTG GGCTGTAAGG CAGCCATGGT CTTTTTCCAA TATTGTGTCA 840 

TGGCTAACTT CTTCTGGCTG CTGGTGGAGG GCCTCTACCT GTACACCCTG CTTGCCGTCT 900 

CCPTCTTCTC TGAGCGGAAG TACTTCTGGG GGTACATACT CATCGGCTGG GGGGTACCCA 960 

GCACATTCAC CATGGTGTGG ACCATCGCCA GGATCCATTT TGAGGATTAT GGTCTGCTCA 1020 

GGTGCTGGGA CACCATCAAC TCCTCACTGT GGTGGATCAT AAAGGGCCCC ATCCTCACCT 1080 

CCATCTTGGT AAACTTCATC CTGTTTATTT GCATCATCCG AATCCTGCTT CAGAAACTGC 1140 

GGCCCCCAGA TATCAGGAAG AGTGACAGCA GTCCATACTC AAGGCTAGCC AGGTCCACAC 1200 

TCCTGCTGAT CCCCCTGTTT GGAGTACACT ACATCATGTT CGCCTTCTTT CCGGACAATT 1260 

TTAAGCCTGA AGTGAAGATG GTCTTTGAGC TCGTCGTGGG GTCTTTCCAG GGTTTTGTGG 1320 

TGGCTATCCT CTACTGCTTC CTCAATGGTG AGGTGCAGGC GGAGCTGAGG CGGAAGTGGC 1380 

GGCGCTGGCA CCTGCAGGGC GTCCTGGGCT GGAACCCCAA ATACCGGCAC CCGTCGGGAG 1440 

GCAGCAACGG CGCCACGTGC AGCACGCAGG TTTCCATGCT GACCCGCGTC AGCCCAGGTG 1500 

CCCGCCGCTC CTCCAGCTTC CAAGCCGAAG TCTCCCTGGT CTGACCACCA GGATCCCAGC 1560 

CCAAGCGGCC CCTCCCGCCC CTTCCCACTC GCAGCAGACG CCGGGGACAG AGGCCTGCCC 1620 

GGGCGCGCCA GCCCCGGCCC TGGGCTCGGA GGCTGCCCCC GGCCCCCTGG TCTCTGGTCC 1680 

GGACACTCCT AGAGAACGCA GCCCTAGAGC C1GCCTGGAG CGTTTCTAGC AAGTGAGAGA 1740 

GATGGGAGCT CCTCTCCTGG AGGATGCAGG TGGAACTCAG TCATTAGACT CCTCCTCCAA 1800 

AGGCCCCCTA CGCCAATCAA GGGCAAAAAG TCTACATACT TTCATCCTGA CTCTGCCCCC 1860 

TGCTGGCTCT TCTGCCCAAT TGGAGGAAAG CAACCGGTGG ATCCTCAAAC AACACTGGTG 1920 

TGACCTGAGG GCAGAAAGGT TCTGCCCGGG AAGGTCACCA GCACCAACAC CACGGTAGTG 1980 

CCTGAAATTT CACCATTGCT GTCAAGTTCC TTTGGGTTAA GCATTACCAC TCAGGCATTT 2040 

GACTGAAGAT GCAGCTCACT ACCCTATTCT CTCTTTACGC TTAGTTATCA GCTTTTTAAA 2100 

GTGGGTTATT CTGGAGTTTT TGTTTGGAGA GCACACCTAT CTTAGTGGTT CCCCACCGAA 2160 

GTGGACTGGC CCCTGGGTCA GTCTGGTGGG AGGACGGTGC AACCCAAGGA CTGAGGGACT 2220 

CTGAAGCCTC TGGGAAATGA GAAGGCAGCC ACCAGCGAAT GCTAGGTCTC GGACTAAGCC 2280 

TACCTGCTCT CCAAGTCTCA GTGGCTTCAT CTGTCAAGTG GGACTCTGTC ACACCAGCCA 2340 

TTCTTATCTC TCTGTGCTGT GGAAGCAACA GGAATCAAGA GACTGCCCTC CTTGTCCACC 2400 

CACCTATGTG CCAACTGTTG TAACTAGGCT CAGAGATGTG CACCCATGGG CTCTGACAGA 2460 

AAGCAGATCC TCACCCTGCT ACACATACAG GATTTGAACT CAGATCTGTC TGATAGGAAT 2520 

GTGAAAGCAC GGACTCTTAC TGCTAACTTT TGTGTATCGT AACCAGCCAG ATCCTCTTGG 2580 

TTATTTGTTT ACCACTTGTA TTATTAATGC CATTATCCCT GAATTCCCCT TGCCACCCCA 2640 

CCCTCCCTGG AGTGTGGCTG AGGAGGCCTC CATCTCATGT ATCATCTGGA TAGGAGCCTG 2700 

CTGGTCACAG CCTCCTCTGT CTGCCCTTCA CCCCAGTGGC CACTCAGCTT CCTACCCACA 2760 

CCTCTGCCAG AAGATCCCCT CAGGACTGCA ACAGGCTTGT GCAACAATAA ATGTTGGCTT 2820 
GGAAAAAAAA AAAA 

SEQ ID NO:177 PM72 Protein sequence:

Protein Accession #: JC2195

1 11 21 31 41 51 

I I I I I I 

MPPPPLLSLR RLGGGWSAVT RLWAAAGAR SRGGRGGSRG AGGGGRGGVA RRRRLELRAA 60 

RSLLGSSLQE ECDYVQMIEV QHKQCLEEAQ LENETIGCSK MWDNLTCWPA TPRGQVWLA 120

CPLIFKLFSS IQGRNVSRSC TDEGWTHLEP GPYPIACGLD DKAASLDBQQ TMFYGSVKTG 180

YTIGYQLSLA TLLVATAILS LFRKLHCTRN YTHMHLFISP ZLRAAAVFIK DLALPDSGES 240

DQCSEGSVGC KAAMVFPQYC VMANFFWLLV EGLYLYTLIA VSFPSERKYF WGYILIGWGV 300

PSTFTHVWTI ARIHFEDYGL LRCWDTINSS LWWIIKGPIL TSILVNFILP ICIIRILLQK 360

LRPPDIRKSD SSPYSRLARS TLLLIPLFGV HYIMPAFFPD KFKPBVKMVF ELWGSFQGF 420

WAILYCFLN GEVQAELRRK WRRWHLQGVL GWNPKYRHPS GGSNGATCST QVSMLTRVSP 480 
GARRSSSFQA EVSLV 

SEQ ID NO:178 BFF8 DNA SEQUENCE

Nucleic Acid Accession #: AL133619

Coding sequence: 1-2070 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

ATGAGCGGTG CGGGGGTGGC GGCTGGGACG CGGCCCCCCA GCTCGCCGAC CCCGGGCTCT 60

CGGCGCCGGC GCCAGCGCCC CTCTGTGGGC GTCCAGTCCT TGAGGCCGCA GAGCCCGCAG 120 

CTCAGGCAGA GCGACCCGCA GAAACGGAAC CTGGACCTGG AGAAAAGCCT GCAGTTCCTG 180 

CAGCAGCAGC ACTCGGAGAT GCTGGCCAAG CTCCATGAGG AGATCGAGCA TCTGAAGCGG 240 

GAAAACAAGG GTGAGCCGGC GCGGGGCCCT AGCCCGGCCC TGCCTCCCCA GGCACACTCA 300

ACACTGCCGC TCCCGCAGCA CAGAAACACA GCCATCAACT CCAGCACACG CCTGGGCTCA 360

GGGGGAACAC AGGACGGGGA GCCCCTCCAG ACTGTCCTTG CCCACCTGGC TGCACT GGCC 420 

CCTGTATGCC AACCCAGTGG GTACAGGTTC TGGGGGACCT GGACAGATGC CGCTACCTCT 480 

AGCCGTGGCT GGACGATGTT ATGCAGCCAA GCACAGCACG TGCTGCTCTC GGGAAGCCCA 540 

GGGCCTGAGG TCATTGCAGG GCGGCAGGTG GCCACAGGGT GCTCCCCAGA CCTCCCTCCT 600 

CCAAGTAGAG CTGAAATGGG AAGGAACCCC TGGGACAGCC OCTGCCCTGC TAGATCTTTG 660

CCTCAGATTG CTGCTGTGGC CAGGCCCAGG ATTTCCAGCC CTATGGCTCT GAGTCCTCAC 720 

ATGCTGGGGG CCCAGGGGAT ATGGACACAC TCCATCCAGG GATCCCTTCC TGCCATCTGG 780 

GCAGCAACCA TGGGGACAAA GGGAGGAAGC AGAGTCCTGT TTCCTTGCCA CTTGTCCAAG 840

GCACTTCCCC ATCCTGACAG CGGOCCCGAC CCAGCCCAGG ATCGTGGGCT GTGGTCTCAA 900

GCTCACTTOC CATTATCTTT GGGGCTGGGG CTGACATCAG GAGGACATCT GACTGGTGGA 960

TGGAGCCAGC CTGGGAACAT CGCAGCTGGG GCAGTGCCTA GGGCTCTCCC TTCCCAGGGA 1020 

GACATGGAGA AGGGGGTTGA GGGAGGGCCC TTCCCTAGCC GCTGTGGCAA CTCCAGTGAG 1080 

CTGTTCTGGG CAAAGTGTGG CCCAAGTCGG CAGCCCCAGC CCTGCAGTGC TGGGGACGCT 1140 

GACAGGACAC GGGAAGAGGC CATGCTTTCC CTCGGGACCT GCTGTTCCAT GTGTCCCAAG 1200

CCCTCCTGCT TTCCAGATGG CCCCTCAGGA AACCACCTTT CCAGGGCCTC TGCTCCCTTG 1260

GGCGCTCGCT GGGTCTGCAT CAACGGAGTG TGGGTAGAGC CGGGAGGAOC CAGCCCTGCC 1320 

AGGCTGAAGG AGGGCTCCTC ACGGACACAC AGGCCAGGAG GCAAGCGTGG GCGTCTTGCG 1380 

GGCGGTAGCG CCGACACTGT GCGCTCTCCT GCAGACAGCC TCTCCATGTC AAGCTTCCAG 1440 

TCTGTCAAGT CCATCTCTAA TTCAGCCAAC TCTCAAGGCA AGGCCAGGCC CCAGCCCGGC 1500 

TCCTTCAACA AGCAAGATTC AAAAGCTGAC GTCTCCCAGA AGGCGGACCT GGAAGAGGAG 1560

CCCCTACTTC ACAACAGCAA GCTGGACAAA GTTCCTGGGG TACAAGGGCA GGCCAGAAAG 1620 

GAGAAAGCAG AGGCCTCTAA TGCAGGAGCT GCCTGTATGG GGAACAGCCA GCACCAGGGC 1680 

AGGCAGATGG GGGCGGGGGC ACACCCCCCA ATGATCCTGC CCCTTCCCCT GCGAAAGCCC 1740 

ACCACACTTA GGCAGTGCGA AGTGCTCATC CGCGAGCTGT GGAATACCAA CCTCCTGCAG 1800 

ACCCAAGAGC TGCGGCACCT CAAGTCCCTC CTGGAAGGGA GCCAGAGGCC CCAGGCAGCC 1860

CCGGAGGAAG CTAGCTTTCC CAGGGACCAA GAAGCCACGC A3TTCCCCAA GGTCTCCACC 1920 

AAGAGCCTCT CCAAGAAATG CCTGAGCCCA CCTGTGGCGG AGCGTGCCAT CCTGCCCGCA 1980 

CTGAAGCAGA CCCCGAAGAA CAACTTTGCC GAGAGGCAGA AGAGGCTGCA GGCAATGCAG 2040 
AAACGGCGCC TGCATCGCTC AGTGCTTTGA

SEQ ID NO:179 BFF8 PROTEIN SEQUENCE
Protein Accession #: T43457

1 11 21 31 41 51

I I I I I I

MSGAGVAAGT RPPSSPTPGS RRRRQRPSVG VQSLRPQSPQ LRQSDPQKRN LDLEKSLQFL 60 

QQQHSEHLAK LHEBXEHLKR ENKGEPARGP RPALPPQAHS TLPLPQHRNT AINSSTRLGS 120 

GGTQDGEPLQ TVLAHLAAIA PVCQPSGYRP WGTWTDAATS SRGWMLCSQ AOHVLLSGSP 180 

GPEVIAGRQV ATGCSPDLPP PSRAEHGRNP WDSPCPARSL PQIAAVARPR ISSPMALSPH 240

MLGAQGIWTH SIQGSLPAIW AATMGTKGGS RVLFPCHLSK ALPHPDSGPH PAQDPGLWS0 300 

AHFPLSLGLG LTSGGHLTGG WSQPGNIAAG AVPRALPSQG DMEKGVEGGP FPSRCGNSSE 360 

LFWAKCGPSR QPQPCSAGDA DRTREEAMLS LGTCCSMCPK PSCFPDGPSG NHLSRASAPL 420 

GARWVCINGV WVEPGGPSPA RLKEGSSRTH RPGGKRGRLA GGSADTVRSP ADSLSHSSFQ 480 

SVKSISNSAN SQGKARPQPG SPNKQDSKAD VSQKADLEEE PLLHNSKLDK VPGVQGQARK 540

EKAEASNAGA ACMGNSQHQG RQMGAGAHPP KILPLPLRKP TTLRQCEVLI RELWNTNLLQ 600 

TQELRHLKSL LEGSQRPQAA PEEASFPRDQ EATHFPKVST KSLSKKCLSP PVAERAILPA 660 

LKQTPKNNFA ERQKRLQAMQ KRRLHRSVL



SEQ ID NO:180 BCR4 DNA SEQUENCE
Nucleic Acid Accession #: NM_012319.2

Coding sequence: 138-2405 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

CTCGTGCCGA ATTCGGCACG AGACCGCGTG TTCGCGCCTG GTAGAGATTT CTCGAAGACA 60 

CCAGTGGGCC CGTGTGGAAC CAAACCTGCG CGCGTGGCCG GGCCGTGGGA CAACGAGGCC 120

GCGGAGACGA AGGCGCAATG GCGAGGAAGT TATCTGTAAT CTTGATCCTG ACCTTTGCCC 180

TCTCTGTCAC AAATCCCCTT CATGAACTAA AAGCAGCTGC TTTCCCCCAG ACCACTGAGA 240 

AAATTAGTCC GAATTGGGAA TCTGGCATTA ATGTTGACTT GGCAATTTCC ACACGGCAAT 300 

ATCATCTACA ACAGCTTTTC TACCGCTATG GAGAAAATAA TTCTTTGTCA GTTGAAGQGT 360 

TCAGAAAATT ACTTCAAAAT ATAGGCATAO ATAAQATTAA AAGAATCCAT ATACACCATG 420 

ACCACGACCA TCACTCAGAC CACGAGCATC ACTCA6ACCA TGAGCGTCAC TCAGACCATG 480 

AGCATCACTC AGACCACGAO CATCACTCTG ACCATGATCA TCACTCTCAC CATAATCATG 540 

CTGCTTCTGG TAAAAATAAG CGAAAAGCTC TTTGCCCAGA CCATGACTCA GATAGTTCAG 600 

GTAAAGATCC TAQAAACAGC CAGG66AAA6 GAGCTCACCG ACCAGAACAT GCCAGTGGTA 660 

GAAGGAATGT CAAGGACAGT GTTAGTGCTA GTGAAGTGAC CTCAACTGTG TACAACACTG 720 

TCTCTGAAGG AACTCACTTT CTAGAGACAA TAGAGACTCC AAGACCTGGA AAACTCTTCC 780 

CCAAAGATGT AAGCAGCTCC ACTOCACCCA GTGTCACATC AAAGAGCCGG GTGAGCCGGC 840 

TGGCTGGTAG GAAAACAAAT GAATCTGTGA GTGAGCCCCG AAAAGGCTTT ATGTATTCCA 900 

GAAACACAAA TGAAAATCCT CAGGAGTGTT TCAATGCATC AAAGCTACTG ACATCTCATG 960 

GCATGGGCAT CCAGGTTCCG CTGAATGCAA CAGAGTTCAA CTATCTCTGT CCAGCCATCA 1020 

TCAACCAAAT TGATGCTAGA TCTTGTCTGA TTCATACAAG TGAAAAGAAG GCTGAAATCC 1080 

CTCCAAAGAC CTATTCATTA CAAATAGCCT GGGTTOGTGG TTTTATAGCC ATTTCCATCA 1140 

TCAGTTTCCT GTCTCTGCTG GGGGTTATCT TAGTGCCTCT CATGAATCGG GTGTTTTTCA 1200 

AATTTCTCCT GAGTTTCCTT GTGGCACTGG CCGTTGGGAC TTTGAGTGGT GATGCTTTTT 1260 

TACACCTTCT TCCACATTCT CATGCAAGTC ACCACCATAG TCATAGOCAT GAAGAACCAG 1320 

CAATGGAAAT GAAAAGAGGA CCACTTTTCA GTCATCTGTC TTCTCAAAAC ATAGAAGAAA 1380 

GTGCCTATTT TGATTCCACG TGGAAGGGTC TAACAGCTCT AGGAGGOCTG TATTTCATGT 1440 

TTCTTGTTGA ACATGTCCTC ACATTGATCA AACAATTTAA AGAIAAGAAG AAAAAGAATC 1500 

AGAAGAAACC TGAAAATGAT GATGATGTGG AGATTAAGAA GCAGTTGTCC AAGTATGAAT 1560 

CTCAACTTTC AACAAATGAG GAGAAAGTAG ATACAGATGA TCGAACTGAA GGCTATTTAC 1620 

GAGCAGACTC ACAAGAGCCC TCCCACTTTG ATTCTCAGCA GCCTGCAGTC TTGGAAGAAG 1680 

AAGAGGTCAT GATAGCTCAT GCTCATCCAC AGGAAGTCTA CAATGAATAT GTACCCAGAG 1740 

GGTGCAAGAA TAAATGCCAT TCACATTTCC ACGATACACT CGGCCAGTCA GACGATCTCA 1800 

TTCACCACCA TCATGACTAC CATCATATTC TCCATCATCA CCACCACCAA AACCACCATC 1860 

CTCACAGTCA CAGCCAGCGC TACTCTCGGG AGGAGCTGAA AGATGCCGGC GTCGCCACTT 1920 

TGGCCTGGAT GGTGATAATG GGTGATGGCC TGCACAATTT CAGCGATGGC CTAGCAATTG 1980 

GTGCTGCTTT TACTGAAGGC TTATCAAGTG GTTTAAGTAC TTCTGTTGCT GTGTTCTGTC 2040 

ATGAGTTGCC TCATGAATTA GGTGACTTTG CTGTTCTACT AAAGGCTGGC ATGACCGTTA 2100 

AGCAGGCTGT CCTTTATAAT GCATTGTCAG CCATGCTGGC GTATCTTGGA ATGGCAACAG 2160 

GAATTTTCAT TGGTCATTAT GCTGAAAATG TTTCTATGTG GATATTTGCA CTTACTGCTG 2220 

GCTTATTCAT GTATGTTGCT CTGGTTGATA TGGTACCTGA AATGCTGCAC AATGATGCTA 2280 

GTGACCATGG ATGTAGCCGC TGGGGGTATT TCTTTTTACA GAATGCTGGG ATGCTTTTGG 2340 

GTTTTGGAAT TATGTTACTT ATTTCCATAT TTGAACATAA AATCGTGTTT CGTATAAATT 2400

TCTAGTTAAG GTTTAAATGC TAGAGTAGCT TAAAAAGTTG TCATAGTTTC AGTAGGTCAT 2460

AGGGAGATGA GTTTGTATGC TGTACTATGC AGCGTTTAAA GTTAGTGGGT TTTGTGATTT 2520 

TTGTATTGAA TATTGCTGTC TGTTACAAAG TCAGTTAAAG GTACGTTTTA ATATTTAAGT 2580 

TATTCTATCT TGGAGATAAA ATCTGTATGT GCAATTCACC GGTATTACCA GTTTATTATG 2640 

TAAACAAGAG ATTTGGCATG ACATGTTCTG TATGTTTCAG GGAAAAATGT CTTTAATGCT 2700 

TTTTCAAGAA CTAACACAGT TATTCCTATA CTGGATTTTA GGTCTCTGAA GAACTGCTGG 2760 

TGTTTAGGAA TAAGAATGTG CATGAAGCCT AAAATACCAA GAAAGCTTAT ACTGAATTTA 2820 

AGCAAAGAAA TAAAGGAGAA AAGAGAAGAA TCTGAGAATT GGGGAGGCAT AGATTCTTAT 2880 

AAAAATCACA AAATTTGTTG TAAATTAGAG GGGAGAAATT TAGAATTAAG TATAAAAAGG 2940 

CAGAATTAGT ATAGAGTACA TTCATTAAAC ATTTTTGTCA GGATTATTTC CCGTAAAAAC 3000

GTAGTGAGCA CTCTCATATA CTAATTAGTG TACATTTAAC TTTGTATAAT ACAGAAATCT 3060 

AAATATATTT AATGAATTCA AGCAATATAC ACTTGACCAA GAAATTGGAA TTTCAAAATG 3120 

TTCGTSCGGG TTATATACCA GATGAGTACA GTGAGTAGTT TATGTATCAC CAGACTGGGT 3180 

TATTGCCAAG TTATATATCA CCAAAAGCTG TATGACTGGA TGTTCTGGTT ACCTGGTTTA 3240 

CAAAATTATC AGAGTAGTAA AACTTTGATA TATATGAGGA TATTAAAACT ACACTAAGTA 3300 

TCATTTGATT CGATTCAGAA AGTACTTTGA TATCTCTCAG TGCTTCAGTG CTATCATTGT 3360 

GAGCAATTGT CTTTATATAC GGTACTGTAG CCATACTAGG CCTGTCTGTG GCATTCTCTA 3420 
GATGTTTCTT TTTTACACAA TAAATTCCTT ATATCAGCTT G 



SEQ ID NO:181 BCR4 PROTEIN SEQUENCE
Protein Accession #: NP_036451



1 11 21 31 41 51 

I I I I I I 

MARKLSVILI LTFALSVTNP LHELKAAAFP QTTEKISPNW ESGINVDLAI STRQYHLQQL 60 

FYRYGENNSL SVEGPRKLLQ NIGIDKIKRI HIHHDHDHHS DHEHHSDHER HSDHEHHSDH 120 

EHHSDHDHHS HHNHAASGKN KRKALCPDHD SDSSGKDPRN SQGKGAHRPE HASGRRNVKD 180 

SVSASEVTST VYKTVSEGTH FLETIETPRP GKLFPKDV5S STPPSVTSKS RVSRLAGRKT 240 

NESVSEPRKG FMYSJRNTNEK PQECPMASKL LTSEGHGIQV PLMATEFNYL CPAIIKQIDA 300 

RSCLIHTSEK KAEIPPKTYS LQIAHVGGFI AISIISFLSL LGVILVPUIK RVFFKFLLSF 360 

LVALAVGTLS GDAFLHLLPH SHASHHHSHS HEEPAHEMKR GPLFSHLSSQ NIEESAYFDS 420 

TOKGLTALGG LYFMFDVEHV LTLIKQFKDK KKKNQKKPEN DDDVEIKKQL SKYESQLSTN 480 

BEKVDTDDRT EGYLRAD5QE PSEFD5QQPA VLEEEEVMXA HAKPQBVYKE YVPRGCKNKC 540 

HSHPHDTLGQ SDDLIHHHHD YHHILHHHHH QNHHPHSHSQ RYSREELKDA GVATLAWMVI 600 

KGDGlrHNFSD GLAIGAAFTE GLSSGLST8V AVFCHELPHE LGDFAVLLXA GKTVKQAVLY 660 

NALSAKLAYL GKATGXFIGH YAENVSMWIF ALTAGLFMYV ALVDMVPEHL HNDASDKGCS 720 
RWGYFFLQNA GMLI/3FGIML LISIPEHKTV FRINF 
SEQ ID NO:182 BCY2 DNA sequence

Nucleic Acid Accession #: NM_001203

Coding sequence: 274-1782 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

CGCGGGGOGC GGAGTCGGCG GGGCCTCGCO GOACGCGGGC AGTGCGGAGA COGCGGCGCT 60 
GAGGACGCGG GAGCCGGGAG CGCACGCGCG GGGTGGAGTT CAGCCTACTC TTTCTTAGAT 120 
GTGAAAGGAA AGGAAGATCA TTTCATGCCT TGTTGATAAA GGTTCAOACT TCTGCTGATT 180 
CATAAGCATT TGGCTCTG AG CTATGACAAG AGAGGAAACA AAA AGTTAAA CTTACAAGOC 240 
TGCCATAAGT GAGAAGCAAA CTTCCTTGAT AACATGCTTT TGCGAAGTGC AGGAAAATTA 300 
AATGTGGGCA CCAAGAAAGA GGATGGTGAG AGTACAGCCC CCACCCCCCG TCCAAAGGTC 360 
TTGCGTTGTA AATGCCACCA CCATTGTCCA GAAGACTCAG TCAACAATAT TTGCAGCACA 420 
GACGGATATT GTTTCACGAT GATAGAAG AG GATGACTCTG GGTTGCCTGT GGTCACTTCT 480 
GGTTGCCTAG GACTAGAAGG CTGAG ATTTT CAGTGTDGGG ACACTCCCAT TOCTCATCAA 540 
AGAAGATCAA TTGAATGCTG CACAG AAAGG AACGAATGTA ATAAAGAGCT ACACCCTACA 600 
CTGCCTOCAT TGAAAAACAG AGATTTTGTT GATGGACCTA TACACCACAG GGCTTTACTT 660 
ATATCTGTGA CTGTCTGTAG TTTGCTCTTG GTCCTTATCA TATTATTTTG TTACTTCCGG 720 
TATAAAAGAC AAG AAACCAG ACCTCGATAC AGCATTGGGT TAGAACAGGA TGAAACTTAC 780 
ATTCCTCCTG GAG AATCCCT GAGAGACTTA ATTGAGCAGT CTCAGAGCTC AGGAAGTGGA 840 
TCAGGCCTCC CTCTGCTGGT CCAAAGGACT ATAGCTAAGC AGATTCAGAT GGTGAAACAG 900 
ATTGGAAAAG GTCGCTATGG GGAAGTTTGG ATGGGAAAGT GGCGTGGCGA AAAGGTAGCT 960 
GTG AAAGTGT TCTTCACCAC AG AGG AAGCC AGCTGGTTCA GAGAGACAGA AATATATCAG 1020 
ACAGTGTTGA TGAGGCATGA AAACATTTTG GGTTTCATTG CTGCAGATAT CAAAGGGACA 1080 
GGGTCCTGGA CCCAGTTGTA CCTAATGACA GACTATCATG AAAATGGTTC CCTTTATGAT 1140
TATCTGAAGT OCACCACCCT AGACGCTAAA TCAATGCTG A AGTTAGCCTA CTCTTCTGTC 1200 
AGTGGCTTAT GTCATTTACA CACAGAAATC TTTAGTACTC AAGGCAAACC AGCAATTGCC 1260 
CATCGAGATC TGAAAAGTAA AAACATTCTG GTGAAGAAAA ATGGAACTTG CTGTATTGCT 1320 
GACCTGGGCC TGGCTGTTAA ATTTATTAGT GATACAAATG AAGTTGACAT ACCAOCTAAC 1380 
ACTCGAGTTG GCACCAAACG CTATATGCCT CCAGAAGTGT TGGACGAGAG CTTGAACAGA 1440 
AA7XACTTCC AGTCTTACAT CATGGCTG AC ATGTATAGTT TTGGCCTCAT CCnTGGGAG 1500 
GTTGCTAGGA G ATGTGTATC AGGAGGTATA GTGG AAGAAT ACC AGCTTCC TTATCATGAC 1560 
CTAGTGCCCA GTGACCCCTC TTATGAGGAC ATGAGGGAGA TTGTGTGCAT CAAGAAGTTA 1620 
CGCCCCTCAT TCCCAAACCG GTGGAGCAGT GATGAGTGTC TAAGGCAGAT GGGAAAACTC 1680 
ATGACAGAAT GCTOGGCTCA CAATCCTGCA TCAAGGCTGA CAGCCCTGCG GGTTAAGAAA 1740 
ACACTTGCCA AAATGTCAGA GTCCCAGGAC ATTAAACTCLGATAGGAGAG GAAAAGTAAG 1800 
CATCTCTGCA GAAAGCCAAC AGGTACTCTT CTGTTTGTGG GCAGAGCAAA AGACATCAAA 1860 
TAAGCATCCA CAGTACAAGC CTTGAACATC GTCCTGCTTC CCAGTGGGTT CAGACCTCAC 1920 
CTTTCAGGGA GCGACCTGGG CAAAGACAGA GAAGCTCCCA GAAGGAGAGA TTGATOCGTG 1980 
TCTGTTTGTA OGCGGAG AAA CCGTTGGGTA ACTTGTTCAA GATATGATGC AT 

SEQ ID NO:183 BCY2 Protein sequence

Protein Accession #: NP_001194



1 11 21 31 41 51 
I I I I I I 

MLLRS AGKLN VGTKKEDGES TAPTPRPKVL RCKCHHHCPE DSVNNICSTD GYCFIMIEED 60 
DSGLPWTSG CLGLBGSDFQ CRDTPIPHQR RSIECCTERN ECNKDLHPTL PPLKNRDFVD 120 
GPIHHRALLI SVTVCSOLV LOLFCYFRY KRQETRPRYS IGLEQDETYI PPGESLRDU 180 
EQSQSSGSGS GLPLLVQRTI AKQIQMVKQI GKGRYGEVWM GKWRGEKVAV KVFFTTEEAS 240 
WFRETEIYQT VLMRHENILG F1AADIKGTG SWTQLYLTTD YHENGSLYDY LKSTTLDAKS 300 
MLKLAYSS VS GLCHLHTEIF STQGKPAIAH RDLKSKNILV KKNGTCQAD LGLAVKFISD 360 
TNBVDIPPNT RVGTKRYMPP EVIDESLNRN HFQSYIMADM YSFGULWEV ARRCVSGGIV 420 
EEYQLPYHDL VPSDPSYEDM RHVCKKLR PSFPNRWSSD ECLRQMGKLM TECWAHNPAS 480 
RLTALRVKKT LAKMSESQDI KL 



SEQ ID NO:184 CBF9 DNA sequence

Nucleic Acid Accession #: AC005383

Coding sequence: 328-2751 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I

GACAGTGTTC GCGGCTGCAC CGCTCGGAGG CTGGGTGACC CGCGTAQAAG TGAAGTACTT 60 

TTTTATTTGC AGACCTGGGC CGATGCCGCT TTAAAAAACG CGAGGGGCTC TATGCACCTC 120 

CCTGGCGGTA GTTCCTCCGA CCTCAGCCGG GTCGGGTCGT GCCGCCCTCT CCCAGGAGAG 180 

ACAAACAGGT GTCCCACGTG GCAGCCGCGC CCCGGGCGCC CCTCCTGTGA TCCCGTAGCG 240 

CCCCCTGGCC CGAGCCGCGC CCGGGTCTGT GAGTAGAGCC GCCCGGGCAC CGAGCGCTGG 300 

TCGCCGCTCT CCTTCCGTTA TATCAACATG CCCCCTTTCC TGTTGCTGGA GGCCGTCTGT 360 

gmTCCTGT TTTCCAGAGT GCCCCCATCT CTCCCTCTCC AGGAAGTCCA TGTAAGCAAA 420 

GAAACCATCG GGAAGATTTC AGCTGCCAGC AAAATGATGT GGTGCTCGGC TGCAGTGGAC 480 

ATCATGTTTC TGTTAGATGG GTCTAACAGC GTCGGGAAAG GGAGCTTTGA AAGGTCCAAG 540 

CACTTTGCCA TCACAGTCTG TGACGGTCTG GACATCAGCC CCGAGAGGGT CAGAGTGGGA 600 

GCATTCCAGT TCAGTTCCAC TCCTCATCTG GAATTCCCCT TGGATTCATT TTCAACCCAA 660 
CA6GAAGTGA AGGCAAGAAT CAAGAGGATG GTTTTCAAAO QAG6GCGCAC GGAQACG6AA 720

CTTGCTCTGA AATACCTTCT GCACAGAGGG TTGCCTGGAG GCAGAAATGC TTCTGTGCCC 780 

CAGATCCTCA TCATCGTCAC TGATGGGAAG TCCCAGGGGG ATGTGGCACT GCCATCCAAG 840 

CAGCTGAAGG AAAGGGGTGT CACTGTGTTT GCTGTGGGGG TCAGGTTTCC CAGGTCGGAG 900 

GAGCTGCATG CACTCGCCAG CGAGCCTAGA GGGCAGCACG TGCTGTTGGC TGAGCAGGTG 960

GAGGATGCCA CCAACGGCCT CTTCAGCACC CTCAGCAGCT CGGCCATCTG CTCCAGCGCC 1020 

ACGCCAGACT GCAGGGTCGA GGCTCACCCC TGTGAGCACA GGACGCTGGA GATGGTCCGG 1080 

GAGTTCGCTG GCAATOCCCC ATGCTGGAGA GGATCGCGGC GGACCCTTGC GGTGCTGGCT 1140 

GCACACTGTC CCTTCTACAG CTGGAAGAGA GTGTTCCTAA CCCAOCCTGC CACCTGCTAC 1200 

AGGACCAOCT GCCCAGGCCC CTGTGACTCG CAGCCCTGCC A GAATG GAGG CACATGTGTT 1260

CCAGAAGGAC TGGACGGCTA CCAGTGCCTC TGCCCGCTGG CCTTTGGAGG GGAGGCTAAC 1320 

TGTGCCCTGA AGCTGAGCCT GGAATGCAGG GTCGACCTCC TCTTCCTGCT GGACAGCTCT 1380 

GCGGGCACCA CTCTGGACGG CTTCCTGCGG GCCAAAGTCT TCGTGAAGCG GTTTGTGCGG 1440 

GCCGTGCTGA GCGAGGACTC TCGGGCCCGA GTGGGTGTGG CCACATACAG CAGGGAGCTG 1500 

CTGGTGGCGG TGCCTGTGGG GGAGTACCAG GATGTGCOTG ACCTGGTCTG GAGCCTCGAT 1560

GGCATTCCCT TCCGTGGTGG CCCCACCCTG ACGGGCAGTG CCTTGCGGCA GGCGGCAGAG 1620 

OGTGGCTTCG GGAGCGCCAC CAGGACAGGC CAGGACCGGC CACGTAGAGT GGTGGTTTTG 1680 

CTCACTGAGT CACACTCCGA GGATGAGGTT GCGGGCCCAG CGCGTCACGC AAGGGCGCGA 1740 

GAGCTGCTCC TGCTGGGTGT AGGCAGTGAG GCCGTGCGGG CAGAGCTGGA GGAGATCACA 1800 

GGCAGCCCAA AGCATGTGAT GGTCTACTCG GATCCTCAGG ATCTGTTCAA CCAAATCCCT 1860

GAGCTCCAGG GGAAGCTGTG CAGCCGGCAG CGGCCAGGGT GCCGGACACA AGCCCTGGAC 1920 

CTCGTCTTCA TGTTCGACAC CTCTGCCTCA GTAGGGCCCG AGAATTTTGC TCAGATGCAG 1980 

AGCTTTGTGA GAAGCTGTGC CCTCCAGTTT GAGGTGAACC CTGACGTGAC ACAGGTCGGC 2040 

CTGGTGGTGT ATGGCAGCCA GGTGCAGACT GCCTTCGGGC TGGACACCAA ACCCACOCGG 2100

GCTGCGATGC TGCGGGCCAT TAGCCAGGCC CCCTACCTAG QTGGGGTGGG CTCAGCCGGC 2160

ACCGOCCTGC TGCACATCTA TGACAAAGTG ATGAOCGTCC AGAGGGGTGC CCGG CCTGG T 2220 

GTCCCCAAAG CTGTGGTGGT GCTCACAGGC GGGAGAGGCG CAGAGGAT6C AGOCGTTCCT 2280 

GCCCAGAAGC TGAGGAACAA TGGCATCTCT GTCTTGGTCG TGGGOGTGGG GCCTGTCCTA 2340 

AGTCAGGGTC TGOGGAGGCT TGCAGGTCCC CGGGATTCCC TGATCCACGT GGCAGCTTAC 2400 

GCCGACCTGC GGTACCACCA GGACGTGCTC ATTGAGTGGC TGTGTGGAGA AGCCAAGCAG 2460

CCAGTCAACC TCTGCAAACC CAGCCCGTGC ATGAATGAGG GCAGCTGCGT CCTGCAGAAT 2520 

GGGAGCTACC GCTGCAAGTG TCGGGATGGC TGGQAGGGCC CCCACTGCGA GAACCGTGAG 2580 

TGGAGCTCTT GCTCTGTATG TGTGAGCCAG GGATGGATTC TTGAGACGCC CCTGAGGCAC 2640 

ATGGCTCCCG TGCAGGAGGG CAGCAGCCGT ACCCCTCCCA GCAACTACAG AGAAGGCCTG 2700 

GGCACTGAAA TGGTGCCTAC CTTCTGGAAT GTCTGTGOOC CAGGTCC TTA G AATGTCTGC 2760

TTCCCGCCGT GGCCAGGACC ACTATTCTCA CTGAGGGAGG AGGATGTOCC AACTGCAGCC 2820 

ATCCTGCTTA GAGACAAGAA AGCAGCTGAT GTCACCCACA AACGATGTTG TTGAAAAGTT 2880 

TTGATGTGTA AGTAAATAOC CACTTTCTGT ACCTGCTGTG CCTTGTTGAG GCTATGTCAT 2940 

CTGCCACCTT TCCCTTGAGG ATAAACAAGG GGTCCTGAAG ACTTAAATTT AGCGGCCTGA 3000 

CGTTCCTTTG CACACAATCA ATGCTCGCCA GAATGTTGTT GACACAGTAA TGCCCAGCAG 3060

AGGCCTTTAC TAGAGCATCC TTTGGACGGC GAAGGCCACG GCCTTTCAAG ATGGAAAGCA 3120 

GCAGCTTTTC CACTTCCCCA GAGACATTCT GGATGCATTT GCATTGAGTC TQAAAGGGGG 3180 

CTTGAGGGAC GTTTGTGACT TCTTGGCGAC TGCCTTTTGT GTGTGGAAGA GACTTGGAAA 3240 

GGTCTCAGAC TGAATGTGAC CAATTAACCA GCTTGGTTGA TGATGGGGGA GGGGCTGAGT 3300 

TGTGCATGGG CCCAGGTCTG GAGGGCCACG TAAAATCGTT CTGAGTCGTG AGCAGTGTCC 3360

ACCTTGAAGG TCTTC 

SEQ ID NO:185 CBF9 Protein sequence
Protein Accession #: none found

1 11 21 31 41 51 

| | I I I I 

MPPFLLLEAV CVFLFSRVPP SLPLQEVHVS KETIGKISAA SKMHWCSAAV DIMFLLDGSN 
SVGKGSFERS KHFAITVCDG LDISPBRVHV GAFQPSSTPH LEFPLDSPST QQBVKARIKR

HVPKGGRTET EIALKYLLHR GLPGGRNASV PQILUVTOG KSQGDVALPS KQLKERGVTV 

FAVGVRFPRW EELHALASBP RGQHVLIAEQ VEDAHJGLFS TLSSSAICSS ATPDCRVEAH 

PCEHRTLEMV REPAGNAPCW RGSRRTLAVL AAHCPPYSWK RVFLTHPATC YRTTCPGPCD 

SQPCQNGGTC VPEGLDGYQC LCPLAFGGEA NCALKLSLEC KVDLLFLLDS SAGTTUX5FL 
RAKVPVKRPV RAVLSEDSRA RVGVATYSRE LLVAVPVGEY QDVPDLVWSL DG1PFRGGPT

LTGSALRQAA ERGFGSATRT GQDRPRKVW LLTESHSEDE VAGPARHARA RELLLLGVGS 

EAVRABLEEI TGSPKHVKVY SDPQDLFNQI PELQGKLCSR QRPGCRTQAL DLVPMLDTSA 

SVGPENPAQM QSFVRSCALQ FEVNPDVTQV GLWYGSQVQ TAPGLDTKPT RAAWLRAISQ 

APYLGGVGSA GTALLHIYDK VMTVQRGARP GVPKAVWLT GGHGAEDAAV PAQKUttWGI 
SVLWGVGPV LSBGLRRLAG PRDSLIHVAA YADLKYHQDV LIEWLCGEAK QPVNLCKPSP

CMNEGSCVLQ NGSYRCKCRD GWEGPHCENR EWSSCSVCVS QGWILETPLR HMAPVQEGSS 

RTPPSNYREG LGTEMVPTFW NVCAPGP 



SEQ ID NO:186 PAV1 DNA sequence

Nucleic Acid Accession #: AF272890

Coding sequence: 87-1520 (underlined sequences correspond to start and stop codons)



60 
120 
180 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 



1 11 21 31 41 51 

| | | | | |

TGCTACCCGC GCCCGGGCTT CTGGGGTGTT CCCCAACCAC GGCCCAGCCC TGCCACACCC 
CCCGCCCCCG GCCTCCGCAG CTCGGCATCG GCGOGGGGGT GCTCGTCCTG GGCGCCTCCG 
AGCCCGGTAA CCTGTOGTCG GCCGCACCGC TCCCCGACGG CGCGGCCACC GCGGCGCGGC 







TGCTGCCTOC 
GTCTGCTGAT 
CCATCGCCAA 
6CGCCGACCT 
GCCGCTGGQA 
TGACG6CCAG 
CGCCCTTCCG 
TGTGGGCCAT 
AGAGCGACGA 
GGGCCTACGC 
TCGTGTACCT 
AGCGCCGTTT 
CGCCCGCQCC 
CCAACGGGCG 
CGCTCAAGAC 
TGGCCAACGT 
TCAACTGGCT 
ACTTCCGCAA 
ACGCGACCCA 
CATCGCCOGG 
CGCGCCTGCT 
GCCTGGACGA 
GGGGCGCGGA 
CCGATAGCAG 
AAGCCACGGA 
ATGTTCCTTG 




GCCGCCCGGA 
TGCGGGTAAG 
GCTGGGCATC 
GGTGAAGGCC 
GGGCTACGCC 
GGCCTTCCAG 
CGGAGACCGG 
GGCCGCCTCG 
GGAGCCCTGG 
GCCGTGCCGC 
CTCCGGGCAC 
GTGAACTCGA 
CCGTPGCACA 
TTG 



AGCCCCGAGC 
GTGCTGCTCA 
CTGCAGACGC 
CTGCTGGTGG 
TTCTTCTGCG 
CTGTGTGTCA 
CTGCTGACGC 
GTGTCCTTCC 
TGCTACAACG 
TCCGTAGTCT 
CGCGAGGCCC 
CCAGCGCGGC 
CCCCCGCGCC 
CGGCGGCCCT 
ATCATGGGCG 
TTCCACCGCG 
AACTCGGCCT 
GGACTGCTCT 
CCGCGCGCCT 
GACGACGACG 
GCCGGCTGCA 
CCCGGCTTCG 
GGCTTCCCAG 
AGCCCACAAT 
AAAAGGAAAG 



240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 



SEQ ID NO:187 PAV1 Protein sequence

Protein Accession #: AA011176



1 11 21 31 41 51
I I I I I I

MGAGVLVLGA SEPGNLSSAA PLPDGAATAA RLLVPASPPA 5LLPPASESP EPLSQQWTAG 60
MGLLHALIVL LIVAGNVLVI VAIAKTPRLQ TLTNLFIMSL ASADLYMGLL WPFGATIW 120
WGRWEYGSFF CELWTSVDVL CVTASIETLC VTALDRYLAI TSPFRYQSLL TRARARGLVC 180
TVWAISALVS PLPILMHWWR AESDEARRCY NDPKCCDFVT NRAYAIASSV VSFYVPLCIM 240
AFVYLRVFRB AQKQVKKIDS CERRFLGGPA RPPSPSPSPV PAPAPPPGPP RPAAAAATAP 300
LANGRAGKRR PSRLVALREQ KALKTLGIIM GVFTLCWLPF FLANWKAFH RBLVPDRIiFV 360
FFNWLGYANS APNPIIYCRS PDFRKAFQGL LCCARRAARR [missing] ASGCZtARPGP 420
PPSPGAASDD DDDDWGATP PARLLEPWAG CNGGAAADSD SSLDEPCRPG FASESKV



SEQ ID NO:188 BCQ2 DNA sequence

Nucleic Acid Accession #: AJ400877

Coding sequence: 81-3080 (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51 
I I I I I I 

GGCGTCCGCG CACACCTCCC CGCGCCGCCG CCGCCACCGC CCGCACTCCG CCGCCTCTGC 60 
COjCAACCGC tgagccatcc atgggggtcg cgggccgcaa CCGTCCCGGG GCGGCCTGGG 120 
CGGTGCTGCT GCTGCTGCTG CTGCTGOCGC CACTGCTGCT GCTGGCGGGG GCCGTCCCGC 180 
CGGGTCGGGG CCGTGCCGCG GGGCCGCAGG AGGATGTAGA TGAGTGTGCC CAAGGGCTAG 240 
ATGACTGCCA TGCCGACGCC CTGTGTCAGA ACACACCCAC CTCCTACAAG TGCTCCTGCA 300 
AGCCTGGCTA CCAAGGGGAA GGCAGGCAGT GTGAGGACAT CGATGAATGT GGAAATGAGC 360 
TCAATGGAGG CK5TGTCCAT GACTGTTTGA ATATTCCAGG CAATTATCGT TGCACTTGTT 420 
TTGATGGCTT CATGTTGGCT CATGACGGTC ATAATTGTCT TGATGTGGAC GAGTGCCTGG 480 
AGAACAATGG CGGCTGCCAG CATACCTGTG TCAACGTCAT GGGGAGCTAT GAGTGCTGCT 540 
GCAAGGAGGG GTTTTTCCTG AGTGACAATC AGCACACCTG CATTCACCGC TCGGAAGAGG 600 
GCCTGAGCTG CATGAATAAG GATCACGGCT GTAGTCACAT CTGCAAGG AG GCCCCAAGGG 660 
GCAGCGTCGC CTGTGAGTGC AGGCCTGGTT TTGAGCTGGC CAAGAACCAG AGAGACTGCA 720 
TCTTGACCTG TAACCATGGG AACGGTGGGT GCCAGCACTC CTGTGACGAT ACAGCCGATG 780 
GCCCAGAGTG CAGCTGCCAT CCACAGTACA AGATCCACAC AGATGGGAGG AGCTGCCTTG 840 
AGCGAGAGGA CACTGTCCTG GAGGTGACAG AGAGCAACAC CACATCAGTG GTGGATGGGG 900 
ATAAACGGGT GAAACGGCGG CTGCTCATGG AAACGTGTGC TGTCAACAAT GGAGGCTGTG 960 
ACCGCACCTG TA AGGATACT TCGACAGGTG TCCACTGCAG TTGTCCTGTT GGATTCACTC 1020 
TCCAGTTGGA TGGGAAGACA TGTAAAGATA TTGATGAGTG CCAGACCCGC AATGGAGGTT 1080 
GTGATCATTT CTGCAAAAAC ATCGTGGGCA GTTTTGACTG CGGCTGCAAG AAAGGATTTA 1140
AATTATTAAC AGATGAGAAG TCTTGCCAAG ATGTGGATGA GTGCTCTTTG GATAGGACCT 1200 
GTGACCACAG CTGCATCAAC CACCCTGGCA CATTTGCTTG TGCTTGCAAC CGAGGGTACA 1260 
CCCTGTATGG CTTCACCCAC TGTGGAGACA CCAATGAGTG CAGCATCAAC AACGGAGGCT 1320 
GTCAGCAGGT CTGTGTGAAC ACAGTGGGCA GCTATGAATG CCAGTGCCAC CCTGGGTACA 1380 
AGCTCCACTG GAATAAAAAA GACTGTGTGG AAGTGAAGGG GCTCCTGCCC ACAAGTGTGT 1440 
CACCCCGTGT GTCCCTGCAC TGCGGTAAGA GTGGTGGAGG AGACGGGTGC TTCCTCAG AT 1500 
GTCACTCTGG CATTCACCTC TCTTCAGATG TCACCACCAT CAGGACAAGT GTAACCTTTA 1560 
AGCTAAATGA AGGCAAGTGT AGTTTGAAAA ATGCTGAGCT GTTTCCCGAG GGTCTGCGAC 1620 
CAGCACTACC AGAGAAGCAC AGCTCAGTAA AAGAGAGCTT CCGCTACGTA AACCTTACAT 1680 
GCAGCTCTGG CAAGCAAGTC CCAGGAGCCC CTGGCCGACC AAGCACCCCT AAGGAAATGT 1740 
TTATCACTGT TGAGTTTGAG CTTGAAACTA ACCAAAAGGA GGTGACAGCT TCTTGTGACC 1800 
TGAGCTGCAT CGTAAAGCOA ACCOAGAAGC GGCTCOGTAA AGCCATOCGC ACGCTCAGAA 1860
AGGCCGTCCA CAGGQAGCAG TTTCACCTCC AGCTCTCAGG CATGAAOCTC GAOGTGGCTA 1920 
AAAAGCCTCC CAGAACATCT GAACGCCAGG CAGAGTCCTG TGGAGTGGGC CAGGGTCATG 1980 
CAGAAAAOCA ATGTGTCAGT TGCAGGGCTG GGAOCTATTA TGATGGAGCA CGAGAACGCT 2040 
GCATTTTATG TOCAAATGGA AGCTTOCAAA ATGAGGAAGG ACAAATG ACT TGTG AAOCAT 2100 
GCCCAAG ACC AGGAAATTCT GGGGCCCTGA AGACGCCAGA AGCTTGGAAT ATGTCTOAAT 2160 
GTGGAGGTCT GTGTCAACCT GGTGA ATATT CTGCAGATGG CTTTGCACCT TGCCAGCTCT 2220 
GTGCCCTGGG CACGTTCCAG CCTGAAGCTO GTCGAACTTC CTGCTTCCCC TGTGGAGGAG 2280 
GCCTTGCCAC CAAACATCAG GGAGCTACTT CCTTTCAGGA CTGTGAAACC AG AGTTCAAT 2340 
GTTCACCTGG ACATTTCTAC AACACCACCA CTCACCGATG TATTCGTTGC CCAGTGGGAA 2400 
CATACCAGCC TGAATTTGGA AAAAATAATT GTGTTTCTTG CCCAGGAAAT ACTACGACTG 2460 
ACTTTGATGG CTCCACAAAC ATAACCCAGT GTAAAAACAG AAGATGTGGA GGGGAGCTGG 2520 
GAGATTTCAC TGGGTACATT GAATCCCCAA ACTACCCAGG CAATTACCCA GCCAACACCG 2580 
AGTGTACGTG GACCATCAAC CCACCCCCCA AGCGCCGCAT CCTGATCGTG GTCCCTGAGA 2640 
TCTTOCTGCC CATAGAGGAC GACTGTGGGG ACTATCTGGT GATGCGG AAA ACCTCTTCAT 2700 
CCAATTCTGT G ACAACATAT GAAACCTGCC AG ACCTACG A ACGCCCCATC GCCTTCACCT 2760 
CCAGGTCAAA GAAGCTGTGG ATTCAGTTCA AGTCCAATGA AGGG AACAGC GCTAGAGGGT 2820 
TCCAGGTCCC ATACGTGACA TATGATGAGG ACTACCAGGA ACTCATTGAA GACATAGTTC 2880 
GAG ATGGCAG GCTCTATGCA 7XTCAGAACC ATCAGGAAAT ACTTAAGGAT AAGAAACTTA 2940 
TCAAGGCTCT GTTTGATGTC CTGGCCCATC CCCAGAACTA TTTCAAGTAC ACAGCCCAGG 3000 
AGTCCCGAGA GATGTTTOCA AGATCGTTCA TCCGATTGCT ACGTTCCAAA GTGTCCAGGT 3060 
TTTTGAGACC TTACAAA7GA CTCAG CCCAC GTGCCACTCA ATACAAATGT TCTGCTATAG 3120 
GGTTGGTGGG ACAGAGCTGT CTTCCTTCTG CATGTCAGCA CAGTCGGGTA TTGCTGCCTC 3180 
CCGTATCAGT GACTCATT AG AGTTCAATTT TTATAG ATAA TACAG ATATT TTGGTAAATT 3240 
GAACTTGGTT TTrcTTTCCC AGCATCGTCG ATGTAG ACTG AGAATGGCTT TG AGTGGCAT 3300 
CAGCTTCTCA CTGCTGTGGG CGGATGTCTT GGATAGATCA CGGGCTGGCT GAGCTGGACT 3360 
TTGGTCAGCC TAGGTGAGAC TCACCTGTCC TTCTGGGGTC TTACTCCTCC TCAAGGAGTC 3420 
TGTAGTGGAA AGGAGGCCAC AGAATAAGCT GCTTATTCTG AAACTTCAGC TTCCTCTAGC 3480 
OCGGCCCTCT CTAAGGGAGC CCTCTGCACT CGTGTGCAGG CTCTGACCAG GCAGAACAGG 3540 
CAAGAGGGGA GGG AAGG AG A CCCCTGCAGG CTCCCTCCAC CCAOCTTGAG ACCTGGGAGG 3600 
ACTCAGTTTC TCCACAGCCT TCTCCAGOCT GTGTGATACA AGTITGATCC CAGGAACTTG 3660 
AGTTCTAAGC AGTGCTCGTG AAAAAAAAAA GCAGAAAGAA TTAGAAATAA ATAAAAACTA 3720 
AGCACTTCTG GAGACAT 

SEQ ID NO:189 BCQ2 Protein sequence

Protein Accession #: CAB92285



1 11 21 31 41 51
I I I I I I 

MGVAGRNRPG AAWAVLIXLL LLPPLLLLAG A VPPGRGRAA GPQEDVDECA QGLDDCHADA 60 
LCQNTPTSYK CSCKPGYQGE GRQCEDIDEC GNELNGGCVH DCLNIPGNYR CTCFDGFMLA 120 
HDGHNCLDVD ECLENNGGCQ HTCVNVMGSY ECCCKEGFFL SDNQHTC3HR SEEGLSCMNK 180 
DHGCSHICKE APRGSVACEC RPGFELAKNQ RDQLTCNHG NGGCQH5CDD TADOPECSCH 240 
PQ VKMHTDGR SCLEREDTVL EVTESNTTSV VDGDKRVKRR LLMETCA VNN GGCDRTCKDT 300 
STGVHCSCPV GFTLQLDGKT CKDIDECQTR NGGCDHFCKN IVGSFDCGCK KGFKLLTDEK 360 
SCQDVDECSL DRTCDHSON HPGTFACACN RGYTLYGFTH CGDTNECSIN NGGCQQVCVN 420 
TVGSYECQCH PGYKLHWNKK DCVEVKGLLP TSVSPRVSLH CGKSGGGDGC FLRCHSG1HL 480 
SSDVTTIRTS VTFKLNEGKC SLKNAELFPE GLRPALPEKH SSVKESFRYV NLTCSSGKQV 540 
PGAPGRPSTP KEMFITVEFE LETNQKEVTA SCDLSCIVKR TEKRLRKAIR TLRKAVHREQ 600 
FHLQLSGMNL DVAKKPPRTS ERQAESCGVG QGHAENQCVS CRAGTYYDGA RERCILCPNG 660 
TFQNEEGQMT CEFCPRFGNS GALKTPEAWN MSECGGLOQP GEYSADGFAP CQLCALGTFQ 720 
PEAGRTSCFP CGGGLATKHQ GATSFQDCET RVQCSPGHFY NTTTHRCIRC PVGTYQPEPG 780 
KNNCVSCPGN TTTDFDGSTN ITQCKNRRCG GELGDFTGYI ESPNYPGNYP ANTECTWTIN 840 
PPPKRRHJV VPEIFLPIED DCGDYLVMRK TSSSNSVTTY ETCQTYERPI AFTSRSKKLW 900 
IQFKSNEGNS ARGFQ VFYVT YDEDYQEUE DIVRDGRLYA SENHQEILKD KKUKALFDV 960 
LAHPQNYFKY TAQESREMFP RSF1RLLRSK VSRFLRPYK 

SEQ ID NO:190 BFG1 DNA sequence

Nucleic Acid Accession #: AF007170

Coding sequence: 1-1725 (underlined sequences correspond to stop codon)

1 11 21 31 41 51 

I I I I I I

AAGGAGGCGG CCTCCGGGAA AAGCGACCGC AGGACTCCTG AGAGCAGCCT CCATGAGGCC 60 
CTGGACCAGT GCATGACCGC CCTGGACCTC TTCCTCACCA ACCAGTTCTC AGAAGCACTC 120 
AGCTACCTCA AGCCCAG AAC CAAGGAAAGC ATCTACCACT CACTGACATA TGCCACCATC 180 
CTGGAGATGC AGGCCATGAT GAOCTTTGAC CCTCAGGACA TCCTGCTTGC CGGCAACATG 240 
ATGAAGGAGG CACAGATGCT GTGTCAGAGG CACCGGAGGA AGTCTTCTGT AACAGATTCC 300 
TTCAGCAGCC TGGTGAACCG CCCCACGCTG GGCCAATTCA CTGAAGAAGA AATCCACGCT 360 
GAGGTCTGCT ATGCAGAGTG CCTGCTGCAG CGAGCAGCCC TGACCTTCCT GCAGGACGAG 420 
AACATGGTGA GCTTCATCAA AGGCGGCATC AAAGTTCGAA ACAGCTACCA GACCTACAAG 480 
GAGCTGGACA GCCTTGTTCA GTCCTCACAA TACTGCAAGG GTGAGAACCA CCCGCACTTT 540 
GAAGGAGGAG TGAAGCTTGG TGTAGGGGCC TTCAACCTGA CACTGTCCAT GCTTCCTACT 600 
AGGATCCTGA GGCTG1 1 G GA GTTTGTGGGG TTTTCAGGAA ACAAGGACTA TGGGCTGCTG 660 
CAGCTGG AGG AGGGAGCGTC AGGGCACAGC TTCCGCTCTG TGCTCTGTGT CATGCTCCTG 720 
CTGTGCTACC ACACCTTCCT CAOCTTOGTG CTCGGTACTG GGAACGTCAA CATCG AGG AG 780 
GCCG AG AAGC TCTTG AAGCC CTACCTGAAC CGGTACGCTA AGGGTGCCAT CTTCCTGTTC 840 
TTTGCAGGGA GGATTGAAGT CATTAAAGGC AACATTG ATG CAGCCATGCG GCGTTTCGAG 900 
GAGTGCTGTG AGGCCCAGCA GCACTGGAAG CAGTTTCACC ACATGTGCTA CTGGGAGCTG 960
ATGTGGTGCTTCAOCTACAA GGGCCAGTGG AAGATGTOCT ACTTCTACGC OGACCTGCTC 1020 
AGCAAGGAGA ACTGCTGGTC CAAGGCCACC TACATTTACA TGAAGGCCGC CTACCTCAGC 1080 
ATGTTTGGGA AGGAGGACCA CAAGCCGTTC GGGGACGACG AAGTGGAATT ATTTCGAGCT 1140
GTGCCAGGCC TGAAGCTCAA GATTGCTGGG AAATCTCTAC CCACAGAGAA GTTTGCCATC 1200
OGGAAGTCCC GGCGCTACTT CTCCTCCAAC CCTATCTCGC TGCCAGTGCC TGCTCTGGAA 1260 
ATGATGTACA TCTGGAACGG CTAOGCCGTG ATTGGGAAGC AGCOGAAACT CACGGATGGG 1320 
ATACTTGAGA TTATCACTAA GGCTGAAGAG ATGCTOOAGA AAGGCCCAGA GAACGAGTAC 1380 

TCAGTGGATG ACQ AGTGCTT GGTGAAATTG TTGAAAGGCC TGTGTCTG AA ATACCTGGGC 1440

CGTGTCCAGG AGGCCGAGGA GAATTTTAGG AGCATCTCTG GCAATGAAAA GAAGATTAAA 1500
TATGACCACT ACTTGATCCC AAACGCCCTG CTGGAGCTGG CCCTGCTGCT TATGGAGCAA 1560 
GACAGAAACG AAGAGGCCAT CAAACTTTTG GAATCTGCCA AGCAAAACTA CAAGAATTAC 1620 
TCCATGGAGT CAAGGACACA CTTTCGAATC CAGGCAGCCA CACTCCAAGC CAAGTCTTCC 1680 

CTAGAGAACA GCAGCAOATC CATGGTCTCA TCAGTGTCCT TGTAGCTTTG TGCAGCAGTT 1740

CCGGGCTGGA AGACAGAGAC AGCTGGACAG AGCTCCTGAA AACATTTCAA AATACCCCCT 1800
CCCCCTGCCC TGCCCTGCCT TTGGGGTCCA CCGGCACTCC AGTTGGATGG CACAACATAG 1860
TCTATOCGTG CAGAAGCOG A GCTGGCATTT TCACCAGTGT AGCCAAGGGC CTTTGOCAAG 1920 
GGCAG AGCAG GTGGAGCOCT CTGCCTGCCC TATCACACAT ACGGGTACTT GCTCTTCACT 1980 
GTGATGTTTA AGAGAATGTA TGAACAGTTT ACATTTTCCT TAGAAATACA TTGATGGGAT 2040 

CACAGTTGGC TTTAAAAACC AACAACAATC AACCACCTGT AAGTCTTTGT CTTCACCTAT 2100
TATCATCTGG AGGTAAATCT CTTTATATG A TGATGCCAAA GGGCAAATTG CTTTTCAAAT 2160 
TCAGCAAGTT CTCAGCTTGT GTGACGGAAG GTCCTTCAGA GGACCTGAGG AATGCCTGGG 2220 
AG AGGCTAAG CCTCAGGCTT CAATGCTTCT GGGGTTGGGC ATGAGGATGT ACACAGACAC 2280 
CCACTACCTT ACTACTCACA CTTCATTTCA CTCCTTTTGT AAATTTCCAA TTTAAAAATC 2340 

AAGCAOGTCT TTTTAGTGAG ATAAAATCTG AGCTCTTCTG TAGAAAAATC AATCTCTACC 2400
AGTAGAAAAT GCCAGGGCTT GATGGAAGAG CTGTGTAGCC CITTCTATGC CAAAGCCAGG 2460 
AAATTTGGGG GGCAGGAGGA GGTTCTCAGA ATCCAGTCTG TATCTTTGCT GTATGCCAAA 2520 
CTGAAAGCAC TGGGAATAAT TTATGAAACA TAAAAATCTT CTGTACTICA CTCCAAGGTA 2580 
CATTTGCTTA CTGACAGCAT TTTTCTTAAA ACTGTTA1TC TTGAAAAAAA AAAAAAAAAA 2640 

AA

SEQ ID NO:191 BFG1 Protein sequence

Protein Accession #: AAC39582 

1 11 21 31 41 51

MTAlLlFLTN QESEALSYLK PRTKESMYHS LTY ATILEMQ AMMTFDPQDI LLAGNMMKEA 60 
QMLCQRHRRK SSVTDSF5SL VNRPTLGQFT EEEIHAEVCY AECLLQRAAL TFLQDENMVS 120 

FIKGGIKVRN SYQTYKELDS LVQSSQYCKG ENHPHFEGGV KLGVGAFNLT LSMLPTRILR 180
LLEFVGFSGN KDYGIJjQLEE GASGHSFRSV LCVMLLLCYH TFLTFVLGTG NVNIEEAEKL 240 
LKFYLNRYFK. GAIFLFFAGR lEVKGNTOA AIRRFEECCE AQQHWKQFHH MCYWELMWCF 300 
TYKGQWKMSY FYADLLSKEN CWSKATYIYM KAAYLSMFGK EDHKPFGDDE VELFRAVPGL 360 
KLKIAGKSLP TEKFAIRKSR RYFSSNPISL PVPALEMMYI WNGYAVIGKQ PKLTDGILEI 420 

ITKAEEMLEK. GPENEYSVDD ECLVKLUCGL OiCYLGRVQE AEENFRSBA NEKKKYDHY 480
UPNALLELA LLtMEQDRNE EAKLLESAK QNYKNYSMES RTHFRIQAAT LQAKSSLENS 540 
SRSMVSSVSL 



SEQ ID NO:192 [illegible] DNA sequence

Nucleic Acid Accession #: NM_032583

Coding sequence: 1-4044 (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51 
| | | | | |

ATOACTAGGA AG AGGACATA CTGGGTGCCC AACTCTTCTG GTGGCCTOGT GAATCGTGGC 60 
ATCGACATAG GCGATGACAT GGTTTCAGGA CTTATTTATA AAACCTATAC TCTCCAAGAT 120 
GGCCCCTGGA GTCAGCAAGA GAGAAATCCT GAGGCTCCAG GGAGGGCAGC TGTfJCCACCG 180 
TGGGGGAAGT ATGATGCTGC CTTGAG AACC ATGATTCCCT TCCGTCCCAA GCCGAGGTTT 240 

CCTGCCCCCC AGCCCCTGGA CAATGCTGGC CTGrTCTCCT ACCTCACCGT GTCATGGCTC 300
ACCCCGCTCA TGATCCAAAG CTTAGGGAGT CGCTTAGATG AGAACACCAT CCCTOCACTG 360 
TCAGTCCATG ATGCCTCAG A CAAAAATGTC CAAAGGCTTC ACCGCCTTTG GGAAGAAGAA 420 
GTCTCAAGGC GAGGGATTGA AAAAGCTTCA GTGCTTCTGG TGATGCTGAG GTTCCAGAGA 480 
ACAAGGTTGA TTTTCGATGC ACTTCTGGGC ATCTGCTTCT GCATTGCCAG TGTACTOGGG 540 

CCAATATTGA TTATAOCAAA GATCCTGGAA TATTCAGAAG AGCAGTTGGG GAATGTTGTC 600
CATGGAGTGG GACTCTGCTT TGCCCTTTTT CTCTCCGAAT GTGTGAAGTC TCTGAGTTTC 660 
TCCTCCAGTT GGATCATCAA CCAACGCACA GCCATCAGGT TCOGAGCAGC TGTTTCCTCC 720 
TTTGCCTTTG AGAAGCTCAT CCAATTTAAG TCTGTAATAC ACATCACCTC AGGAGAGGGC 780 
ATCAGCTTCT TCACCGGTGA TCTAAACTAC CTGTTTGAAG GGGTGTGCTA TGGAOCCCTA 840 

GTACTGATCA CCTGCGCATC GCTGGTCATC TGCAGCATTT CTTCCTACTT CATTATTGGA 900
TACACTGCAT TTATTGCCAT CTTATGCTAT CTCCTGGTTT TCCCACTGGC GGTATTCATG 960 
ACAAGAATGG CTGTGAAGGC TCAGCATCAC ACATCTGAGG TCAGCGACCA GCGCATCCGT 1020 
GTGACCAGTG AAGTTCTCAC TTGCATTAAG CTGATTAAAA TGTACACATG GGAGAAACCA 1080 
TTTGCAAAAA TCATTGAAGG TATGGAAAGT CTGACTTTCT GCTCCAAACC TGGTGATGGC 1140 

ATGGOCTTCA GCATGCTGGC CTCCTTGAAT CTXCTTCGGC TGTCAGTGTT CTTTGTGCCT 1200
ATTGCAGTCA AAGGTCICAC GAATTCCAAG TCTGCAGTGA TGAGGTTCAA GAAGTnTTC 1260 
CTCCAGGAGA GOCCTGTTTT CTATGTCCAG ACATTACAAG ACCCCAGCAA AGCTCTGGTC 1320 
TTTGAGGAGG CCACCTTGTC ATGGCAACAG ACCTGTCCCG GGATCGTCAA TGGGGCACTG 1380 
GAGCTGGAG A GGAACGGGCA TGCTTCTGAG GGGATGACCA GGCCTAGAGA TGCCCTCGGG 1440 
CCAGAGGAAG AAGGGAACAG CCTGCGCCCA GAGTTGCACA AGATCAACCT GGTGGTGTCC 1500
AAGGGGATGA TGTTAGGGGT CTGCGGCAAC ACGGGGAGTG GTAAGAGCAG CCTGTTGTCA 1560 
GCCATCCTGG AGGAGATGCA CTTGCTCGAG GGCTCGGTGG GGGTGCAGGG AAGCCTGGCC 1620 
TATGTCCCCC AGCAGGCCTG GATOGTCAGC GGGAACATCA GGGAGAACAT OCTCATGGGA 1680 
GGOGCATATG ACAAGGCCCG ATACCTCCAG GTGCTOCACT GCTGCTCOCT G AATCGGGAC 1740
CTGGAACITC TGCCCTTTGG AGACATGACA GAGATTGGAG AGCGGGGCCT CAACCTCTCT 1800 
GGGGGGCAG A AACAGAGG AT CAGOCTGGGC CGCGCOGTCT ATTCOGACCG TCAG ATCTAC 1860 
CTCCTGGACG ACCCCCTGTC TGCTGTGGAC GCCCACGTGG GGAAGCACAT TTTTGAGGAG 1920 
TGCATTAAGA AGACACTCAG GGGGAAGACG GTCGTCCTGG TGACCCACCA GCTGCAGTAC 1980 

TTAGAATTTT GTGGOCAGAT CATTTTGTTG GAAAATGGGA AAATCTGTGA AAATGOAACT 2040
CACAGTGAGT TAATGCAGAA AAAGGGGAAA TATGCOCAAC TTATCCAGAA GATGCACAAG 2100 
GAAGCCACTT CGGACATGTT GCAGGACACA GCAAAGATAG CAGAGAAGCC AAAGGTAGAA 2160 
AGTCAGGCTC TGGCCACCTC CCTGGAAGAG TCTCTCAACG GAAATGCTGT GCCGGAGCAT 2220 
CAGCTCACAC AGGAGGAGGA GATGGAAGAA GGCTCCTTGA GTTGGAGGGT CTACCACCAC 2280

TACATCCAGG CAGCTGGAGG TTACATGGTC TCTTGCATAA TTTTCTTCTT CGTGGTGCTG 2340
ATCGTCTTCT TAACGATCTT CAGCTTCTGG TGGCTGAGCT ACTGGTTGGA GCAGGGCTCG 2400 
GGGACCAATA GCAGCCX5AGA GAGCAATGGA ACCATGGCAG ACCTGGGCAA CATTGCAGAC 2460 
AATCCTCAAC TGTCCTTCTA CCAGCTGGTG TAOGGGCTCA ACGOCCTGCT CCTCATCTGT 2520 
GTGGGGGTCT GCTCCTCAGG GATTTTC ACC AAAGTCACGA GGAAGGCATC CACGGCCCTG 2580 

CACAACAAGC TCTTCAACAA GGTTTTCCGC TGCCOCATGA GTTTCTTTGA CACCATCCCA 2640
ATAGGCCGGC TTTTGAACTG CTTCGCAGGG GACTTGGAAC AGCTGGACCA GCTCTTGCCC 2700 
ATCTTTTCAG AGCAGTTCCT GGTCCTGTCC TTAATGGTGA TCX5GOGTCCT GTTGATTGTC 2760 
AGTGTGCTGT CTOCATATAT CCTGTTAATG GGAGCCATAA TCATGGTTAT TTGCTTCATT 2820 
TATTATATGA TGTTCAAGAA GGCCATCGGT GTGTTCAAGA GACTGGAGAA CTATAGCCGG 2880 

TCTCCTTTAT TCTCCCACAT CCTCAATTCT CTGCAAGGCC TGAGCTCCAT OCATGTCTAT 2940

GG AAAAACTG AAGACTTCAT CAGCCAGTTT A AG AGGCTGA CTGATGCGCA G AATAACTAC 3000 
CTGCTGTTGT TTCTATCTTC CACACG ATGG ATGGCATTGA GGCTGGAGAT CATGACCAAC 3060 
CTTGTGACCT TGGCTGTTGC CCTGTTCGTG GCTTTTGGCA TTTCCTCCAC CCCCTACICC 3120 

TTTAAAGTCA TGGCTOTCAA CATCGTGCTG CAGCTGGCGT CCAGCTTCCA GGCCACTG CC 3180

CGGATTGGCT TGGAGACAG A GGCACAGTTC ACGGCTGTAG AGAGGATACT GCAGTACATG 3240
AAGATGTGTG TCTCGGAAGC TCCTTTACAC ATGGAAGGCA CAAGTTGTCC CCAGGGGTGG 3300 
CCACAGCATG GGGAAATCAT ATTTCAGGAT TATCACATGA AATACAGAGA CAACACACCC 3360 
AOCGTGdTC AOGGCATCAA CCTGACCATC CGCGGCCACG AAGTGGTGGG CATCGTGGGA 3420 
AGG ACGGGCT CTGGGAAGTC CTCCTTGGGC ATGGCTCTCT TCCGCCTGGT GG AGCCCATG 3480 

GCAGGCCGGA TTCTCATTGA OGGCGTGGAC ATTTGCAGCA TCGGCCTGGA GGACTTGCGG 3540
TCCAAGCTCT CAGTGATCCC TCAAGATCCA GTGCTGCTCT CAGGAACCAT CAGATTCAAC 3600 
CTAGATCCCT TTGACOGTCA CACTG ACCAG CAG ATCTGGG ATGCCTTGGA GAGGACATTC 3660 
CTGACCAAGG CCATCTCAAA GTTCCCCAAA AAGCTGCATA CAGATGTGGT GGAAAACGGT 3720 
GGAAACTTCT CTGTGGGGGA GAGGCAGCTG CTCTGCATTG CCAGGGCTGT GCTTCGCAAC 3780

TOCAAGATCA TCCTTATCGA TGAAGCCACA GCCTCCATTG ACATGGAGAC AGACACCCTG 3840
ATCCAGCGCA CAATCCGTG A AGCCTTCCAG GGCIGCACCG TGCTCGTCAT TGCCCACCGT 3900 
GTCAOCACIG TGCTGAACTG TG ACCACATC CTGGTTATGG GCAATGGGAA GGTGGTAGAA 3960 
TTTGATCGGC CGGAGGTACT GOGGAAGAAG CCTGGGTCAT TGTTCGCAGC OCTCATGGCC 4020 
ACAGCCACTT CTTCACTGAG ATAAGGAGAT GTGGAGACTT CATGGAGGCT GGCAGCTGAG 4080 

CTCAGAGGTT CACACAGGTG CAGCTTCGAG GCCCACAGTC TGCGACCTTC TTGTTTGGAG 4140
ATGAGAACTT CICCTGGAAG CAGGGGTAAA TGTAGGGGGG GTGGGGATTG CTGGATGGAA 4200 
ACCCTGGAAT AGGCTACTTG ATGGCTCTCA AGACCTTAGA ACCCCAGAAC CATCIAAGAC 4260 
ATGGGATTCA GTGATCATGT GGTTCTCCTT TTAACTTACA TGCTGAATAA TTTTATAATA 4320 
AGGTAAAAGC TTATAGTTTT CTG ATCTGTG TTAGAAGTGY TGCAAATGCT GTACTGACTT 4380 

TGTAAAATAT AAAACTAAGG AAAACTCAAA AAAAAAAAAA AAAAAAA

SEQ ID NO:193 [illegible] Protein sequence

Protein Accession #: NP_115972.1

1 11 21 31 41 51
I I I I I I 

MTRKRTYWVP NSSGGLVNRG IDIGDDMVSG LXYKTYTLQD GPWSQQERNP EAPGRAA VPP 60 
WGKYDAALRT MiPFRPKPRF PAPQPLDNAG LFSYLTVS WL TPLMIQSLRS RLDENTTPPL 120 
SVHDASDKNV QRLHRLWEEE VSRRGIEKAS VLLVMLRFQR TRLIFDALLG ICFCIAS VLG 180 

PDJIPKILE YSEEQLGNW HGVGLCFALF LSECVKSLSF SSSWDNQRT AIRFRAA VSS 240
FAFEKUQFK S VMTSGEA BFFTGDVNY LFEGVCYGPL VLTTCASLVI CSISSYFUG 300 
YTAFIAILCY LLVFPLAVFM TRMAVKAQHH TSEVSDQRIR VTSEVLTCHC LIKMYTWEKP 360 
FAKIIEGMES LTFCSKPGDG MAFSMLASLN LLRLSVFFVP IAVKGLTNSK SAVMRFKKFF 420 
LQESPVFYVQ TLQDPSKALV FEEATLSWQQ TCPGIVNGAL ELERNGHASE GMTRPRD ALG 480

PEEEGNSLGP ELHKINLWS KGMMLG VCGN TGSGKSSLLS AILEEMHLLE GSVGVQGSLA 540
YVPQQAWTVS GNIRENILMG GAYDKARYLQ VLHCCSLNRD LELLPFGDMT HGERGLNLS 600 
GGQKQRISLA RAVYSDRQIY LLDDPLS AVD AHVGKHIFEE CKKTLRGKT WLVTHQLQY 660 
LEFCGQIILL ENGKICENGT HSELMQKKGK YAQUQKMHK EATSDMLQDT AKIAEKPKVE 720 
SQALATSLEE SLNGNAVPEH QLTQEEEMFJE GSLSWRVYHH YIQAAGGYMV SOIFFFWL 780 

IVFLT1FSFW WLSYWLEQGS GTNSSRESNG TMADLGNIAD NPQLSFYQLV YGLNALLUC 840
VGVCSSGDFT KVTRKASTAL HNKLFNKVFR CPMSFFDTIP IGRLLNCFAG DLEQUX^LLP 900 
1FSEQFLVLS LMVIAVLUV SVLSPYILLM GAUMVICH YYMMFKKAIG VFKRLENYSR 960 
SPLFSHILNS LQGLSSWVY GKTEDFISQF KRLTDAQNNY LLLFLSSTRW MALRLEIMTN 1020 
LVTLAVALFV AFG1SSTPYS FKVMAVNIVL QLASSFQATA RIGLETEAQF TAVERILQYM 1080 

KMCVSBAPLH MEGTSCPQGW PQHGEQFQD YHMKYRDNTP TVLHGINLTI RGHEWGIVG 1140
RTGSGKSSLG MALFRLVEPM AGRHJDGVD ICS1GLEDLR SKLSVEPQDP VLLSGTIRFN 1200 
LDPFDRHTDQ QIWDALERTF LTKA1SKFPK KLHTDVVENG GNFSVGERQL LOARAVLRN 1260 
SKULIDEAT ASIDMETDTL IQRTIREAFQ GCTVLVLAHR VTTVLNCDHI LVMGNGKWE 1320 
FDRPEVLRKK PGSLFAALMA TATSSLR 
SEQ ID NO:194 [illegible] DNA sequence

Nucleic Acid Accession #: AA983251

Coding sequence: 1-1749 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

ATQCTGTCTG GCTTCTTGAT GAGTCCCAGT ACCCAGCACA GAGCACAGTA CACTCCCGGA 60 

GGAAAGAAAC TTCCGTGGGA GGCTTCCATC GGTGCGCACA CCTCCCGAGG GCGAG6CA6C 120 

GACCGGGAGA GGGAGAGCCG GCCGGAGGCT GCCGGGCTCC TGTGGGACCG CGCTGCAGCC 180 

GGGGAGGCGG AGAAGGGGAA CCGGGGCGAG CCGCCCGCCT GGATCCGCGC CCAGCAGCAG 240 

CCGCGGCCGC CGCCAGCTGG GCAGGCTCCC GGGACTGCGG CTGGGGGCGC GCAGGACCCT 300 

CGCCTGCGTC CTGGACGTTC CCGGGGGAGG GTCCGGTTGC CAGTGAAACC TCCAGAGGCT 360 

TCCGGACGAC AGCCCCGGGG GCCTTCTGAC TGCATCCCGA GATTTCCATC AGCGAGTGCA 420 

ACTCATAAGG CAGTCCCTAA GGGGACCGGG CCACCGGCTG AGGAOGGGGA TGGCTTAGGA 480 

GCTCCTGGAC CTAGGGCCCG GCGTCGTCGC CTCCTGGGCG TCGOGGCAGA GGGGAGTOGC 540 

CCGCGCGGAA AGCGCCGCGG GACAGTCAGT GACGAGGCCC GGGGGTCGCC GGGGCCACGA 600 

CTTCTCGGAG ACCGTCCTGC GCTCTCTGGA GACGCGCTGT CCGCGCCCAG GGTCGTGCCA 660 

TGTGGGGCGC TCGCCGCTCG TCCGTCTCCT CATOCTGGAA CGCCGCTTCG CTCCTGCAGC 720 

TGCTGCTGGC TGCGCTGCTG GCGGCGGGGG CGAGGGCCCA GCGGCGAGTA CTGCCACGGC 780 

TGGCTGGACG CGCAGGGCGT CTGGCGCATC GGCTTCCAGT GTCCCGAGCG CTTCGACGGC 840 

GGCGACGCCA CCATCTGCTG CGGCAGCTGC GCGTTGCGCT ACTGCTGCTC CAGCGCCGAG 900 

GCGCGCCTGG ACCAGGGCGG CTGCGACAAT GACCGCCAGC AGGGCGCTGG CGAGCCTGGC 960 

CGGGCGGACA AAGACGGGCC CCGACGGCTC GGCAGGGCTT CATGTCTTAG GGGTACCCAA 1020 

GGAGACGGCG AGGGTGCGCC CCCACCCGTG AGGGCCTGGC AGCGGTGCTC CCCTGAAGGC 1080 

TCCCCGAAAG GAAGGCAGCT CCTCAGGGCT TTCCCGGGGC TGCTGCCCCG TGCCAGACGC 1140 

CGCGGATTCC CATCTTCTCC ACGCGGCGGC CCCTCTCCCC TGCAGCGGOC CGCCTTGCCC 1200 

ATCTACGTGC CGTTCCTCAT TGTTGGCTCC GTGTTTGTCG CCTTTATCAT CTTGGGGTCC 1260 

CTGGTGGCAG CCTGTTGCTG CAGATGTCTC CGGCCTAAGC AGGATCCCCA GCAGAGCCGA 1320 

GCCCCAGGGG GTAACCGCTT GATGGAGACC ATOCCCATGA TCCCCAGTGC CAGCACCTCC 1380 

CGGGGGTCGT CCTCACGCCA GTCCAGCACA GCTGCCAGTT CCAGCTCCAG CGCCAACTCC 1440 

GGGGCCCGGG CGCCCCCAAC AAGGTCACAG ACCAACTGTT GCTTGCCGGA AGGGACCATG 1500 

AACAACGTGT ATGTCAACAT GCCCACGAAT TTCTCTGTGC TGAACTGTCA GCAGGCCACC 1560 

CAGATTGTGC CACATCAAGG GCAGTATCTG CATCCCCCAT AGGTGGGGTA CACGGTGCAG 1620 

CACGACTCTG TGCCCATGAC AGCTGTGCCA CCTTTCATGG ACGGCCTGCA GCCTGGCTAC 1680 

AGGCAGATTC AGTCCCCCTT CCCTCACACC AACAGTGAAC AGAAGATGTA CCCAGCGGTG 1740 

ACTGTATAAC CGAGAGTCAC TGGTGGGTTC CTTTACTGAA GGGAGACGAA GGCAGGGGTG 1800 

GATTCTCGAG GTGGAAGTCC GCACATGTCG GTGGTATTTA TGGCACGATT CCTTTGGATG 1860 

GCTTCATTTG CCCCCAGACT GTATGAAAAC ATCTCOGAAT TAGCATTTCT GGATATGTTT 1920 

CATCCAGGGT ATCATTGATT TATGATGGAA AACCGGCCTC AGCTGGAGAT GACTGTGATG 1980 

TTGCTGATGG GTGTATAACA AATGCTTGAG TCCGAAGTGC CCTTGAGATA TGGTTGACGA 2040 

AAGAATTTTA TAAACTGATA AATTAAGGAT TTTTATTATG TTGTTATTAT TATTTCTTTT 2100 

TTGTTGTTGA CTGCACAGGA TCAAAATGCC TGTTATCTCC CTTTTACTGG GACTTTTTTT 2160 

Wm TT T T TTTTTTTTAA TCAGACAGGG TCTTGCTCTG TTGCCCAGGC TGGAGTGCAG 2220 

TGGTGCGATC TCGGCTCACT GCAACTTCAG CCTCCTGGAT TCAGGCAACA CTCCTGOCTC 2280 

AGCCTCCCAC GTGGCTGGGA TTACAGGTGC CTGCCCCCAT GGCTAATTTT TTGTATTTTT 2340 

TGTAGAGATG GGGTTTCAOC ATGTTGGCTG GGCTGGTC T C ACTCTCCTGA CCTCAAGCAA 2400 

TCTGCCTGTC TCAGCCTCCC AAAGTGCTGG GATTACAGGC GTOAGCCACC GCCCCCAGCC 2460 

TGAGCCTTTT TTTTTTTCTA ATGCATCCAA GGTTAAGGGG AAGACGCAAA TAACAGGACT 2520

ATTCTAAAAG GAAACCTGTT TGAACTCTGT GAGATCAGTC ATCAGTCTCA GTATTCCACA 2580 

GGCACAOCTT AATTTCATTG TAAAAAGATA TATATATTTT GTCTATTTTT GTGCTTTTGG 2640 

GGGCCTATTT TGTGCTTTTT TACCTTATGT AGAGATCTTA TTACAAAGTG ATTTTCTACA 2700 

TTAAAAAGAG ACTGAAATAA ATTGTATAGT TACTTAACTA ATGAAGACAT TTCAGAACTC 2760 

TGGGATGATT TTAATCTTGA AGTAGTAGGT GGTATAGTCA TAAAACCATT CATCCCCTTC 2820 

•JTCATTGTAT CTTAATTTTC TGGCTTTAAG GTGACATCTG AGAGGTAATG CATTCTTTTT 2880 

TATATTGAAA TCATAAACTA TCACCCGCTG CTTCTCTGAG TTACTTTTAA TTTTGCCTTG 2940 

TGGTTATGGT TTGGCGTTTC CTTCTGTTTG GTTTTCAGAG CCCCATGTCT ATATAGTCCT 3000 

GAGTGCAAGT AATTACTATA CTTGTAAATG AAGATCAGTA TTTCTGCCTA GATCTGATAA 3060 

AAAAATTTTC TTGTCTTAGT TATAAAAATT CAAAGAAATG TGTTACAAAG ATACTTAGTA 3120 

TAGCTCCTCA GCCATAACCT GAGACTTGGG ATGAAATTTA AACCAGATAC GATTTACTTT 3180 

GCAGATCATA AGGCTTTTTA TACTCTTGTT ATCAAAATGG CTTATTTTTC AGGCACTAAG 3240 

GATTGTTAAG AGAAAAGCTT TTCAACGAAG GATTGCCTTT CTTCTCCCAC ACTGTTCTTG 3300 

ATTTCCTCTC TCTTTCAGGC CTCAACAGGC ACTGTATTCA TTGCCAATGT TCCAAATTAT 3360 

CAAATTCAAG TGAATTTATT TGTGTGTTCT TTACTTATAT AAAAAAAGAT AACTTTAAGG 3420 

ATGTGCAAGT ACATTTCCAA CTGCTAGCAC AACCAGTATT TTGTAATTAA ACAAATCGCT 3480 

GTATGGTATG GTCTTCTACA CATTTATGTC TATAGATATC TATCGATCAT CTTTCTATTC 3540 

TGTTTCATGA CTGAATAATG TAAAAOCAGT GTTGGCAATT GGTATCATCA ATGATACTCA 3600 

TTTTTTAATA ACCAAAGGCA GGGGAAAATC ATTTTACTTA TTAATAAATA TTTTATGATG 3660 
TGAAAAAAAA AAAAAAAAAA AAAAAAAAAA 
Protein Accession #: none found



1 11 21 31 41 51

| | | | | |

MLSGFLMSPS TQHRAQYTPG GKKLPWEASI GAHTSRGRGS DRERESRPEA AGLLWDRAAA 60 

GEAEKGNRGE PPAWIRAQQQ PRPPPAGQAP GTAAGGAQDP RLRPGRSRGR VRLPVRPPEA 120 

SGRQPRGPSD CIPRFPSASA THKAVPKGTG PPAE06D6LO APGPRARRRR LLGVAAEGSG 180 

PRGKRRGTVS DEARGSPGPR LLGDRPALSG DALSAPRWP CGALAARPSP HPGTPLRSCS 240 

CCWLRCWRRG RGPSGEYCHG WLDAQGVWRI GFQCPERFDG GDATICCGSC ALRYCCSSAE 300 

ARLDQGGCDN DRQQGAGEPG RADKDGPRRL GRASCLRGTQ GDGEGAPPPV RAWQRCSPEG 360 

SPKGRQLLRA FPGLLPRARR RGFPSSPRGG PSPLQRPALP IYVPFLIVGS VFVAFIILGS 420

LVAACCCRCL RPKQDPQQSR APGGNRLMET IPMIPSASTS RGSSSRQSST AASSSSSANS 480

GARAPPTRSC TNCCLPEGTM NNVYVNMPTN FSVLNCQQAT QIVPHQGQYL HPETftfGVTVQ 540 
HDSVPMTAVP PFMDGLQPGY RQIQSPFPHT NSEQKMYPAV TV



SEQ ID NO:196 C0A5 DNA SEQUENCE

Nucleic Acid Accession #: AA08B458

Coding sequence: 862-1995 (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51

| | | | | |

GCCCTTGGAC ACTGACATGG ACTGAAGGAG TAGAATGGAG CACGAGGACA CTGACATGGA 60 

CTGAAGAAAA AGGAGCTGGA GCAGGAGAAG GAGGTGCTGC TGCAGGGTTT GGAGATGATG 120 

GCGCGGGGCC GCGACTGGTA CCAGCAGCAG CTGCAACGAG TGCAGGAGCG CCAGCGCCGC 180

CTGGGCCAGA GCAGAGCCAG CGCCGACTTT GGGGCTGCAG GGAGCCCCCG CCCACTGGGG 240 

CGGCTACTGC CCAAGGTACA AGAGGTGGCC CGGTGCCTGG GGGAGCTGCT GGCTGCAGCC 300 

TGTGCCAGCC GGGCCCTGOC CCCGTCCTCC TCCGGGCCCC CCTGCCCTGC CCTGACGTCC 360 

ACCTCACCCC CGGTCTGGCA GCAGCAGACC ATCCTCATGC TGAAGGAGCA GAACCGACTC 420 

CTCACCCAGG AGGTGACCGA GAAGAGTGAG CGCATCACGC AGCTGGAGCA GGAGAAGTCG 480 

GCGCTCATTA AGCAGCTGTT TGAGGCCCGC GCCCTGAGCC AGCAGGACGG GGGACCTCTG 540 

GATTCCACCT TCATCTAGTC CTTGTGGGCC GCGTGGGCCC CCAGGGCCAG CCTGGCACTC 600 

AGCCCTTCGA GGGTGGGCGC CCCATCGCAC CCACCCTCTC TGGCTGGAGA CCCCCGGCAG 660 

GCCCAGGCAC AGTCCCGGAG TGGGCGCCTT CCTGCCGCCC TTGCCAGATG GGCTCCCCAG 720 

GCCTGOCCCC GGCTGGTCCC CGCACCGAGC GCTTGACTCC GTTTKGGCTC CTGGTTGYTG 780 

ACATGGGCTG GGGGCTCTCT TGAGTCCGCA TAGTCCGCAG CTACTACTGG CCGCTGTCAG 840 

TGGACAGTGG GGTACCCCTC CATGAGTTAG CGTCCCCCCG TTTCCAGCGG TGCCGCCCTG 900 

GGTCOCATCT TCAGGGAAAG GCACTGCCCA CGCCAGGCTG CACTTCCAAC AACGGGCAGC 960 

AGAGGGCGCG GGGCGGCTCC GACGCGGGTC CAAGGGCAGC TTCCCGCTCA ACCAGGGCAC 1020 

CAGGACGAGG TGGCTGTAGC TCGGACGGAC GGAAGTAGAT GGAGGGGGTG GGGACGGCCT 1080 

GTAAGOGGGG GGTGCCTGCC TGGCTGGGGA GCCCCAGGGA TAGCGGTOGG ACTTCAGGTT 1140 

CTGGCCAAGG CTGAGGGAOC CTGGCTGCAG OGGATCGGCA CGCCGGGTGG GCGAGAGCTT 1200 

GGCCTGCATG TGCCTCCCAC AGACCCTGGG GTGATGGCCT TCCCCCTCTT GGCCGGGACG 1260 

TTGCCCCACG TTGAGTCCCA CACAACATCC TGTGAGCCTG GCTCCCCAGG AGGGCCCCCA 1320 

GACAGCTCCC AGGCAOGTCA TAGGCAAAGC CTGTTTCCCC CGACTCAGGA TTTCCAAGGC 1380 

CTGGGGTCCT GCTCACCCCC CTTTGCTCTC ACGCCCAGCC TGTCCCCAGG TTTCAGCTGG 1440 

GAGAGGCCAC CTCCCTCAGC CAAGGAAAAC GAGAAOOCCC AGGGTACAGG AGGAGGCTGG 1500 

GGCAGGTCCC CTTGGGTGTC ACTCCCTCAG CCCCTGOOCA GGCCCACTCC CGCTGGTGCT 1560 

GGAGTACGCA CTGGTGGGGG GGCCCTGCTC AGCCCAACCT GGAGGGTCCC AGTGTCACCA 1620

GAACCAGGGG CACGGCAACA GCATCGATGG GTTCTGCAGC CCAGGGCCCC CGATGCGGGG 1680 

TCAGTGTGTG TGGGGOGCAG GGCCTCCGAT GCGGGGTCAG TGCGTGGGGG GCGCAGGGCC 1740 

CCCGATGCGG GGTCAGTGOG TGGGGGGCGC AGGGCCCCCT CGTGTCCAGG GCACTTTGGT 1800 

ACACTGTCCC ACAAGGCACC TGTCTCAGAG GAGGGGCCCT GGCAGGCAGC GTGGCAACTC 1860 

CCTTCCGGAG OCCAGCTCCA TGCTAACCTG CCCACAGCAA CCCCACAGAG CCACATTCCC 1920 

TGCTGCACCT GGTCTGCAGG GGTGTCCCAG GACAGGCCCA AGTCAGCCCA GCATGCAGCT 1980 

GCCCTCCTAC CCTGAAGATG GGAGTGGGCT TTCCAGGGGA CATAAGGATG TCAGGCCTGG 2040 

ACCTCCTGGG CAGGAAAGGG TGCAGGTCCT GAGGGCCTGT GCCCCACAGC CCCAGCACCC 2100 

AGGTGGACTG CAGCGCAGTG GGTGGGCCAG TGGCAGCCAG GGAGAAGCCC CCCGTCAGCA 2160 

GGCTGGGGTC TGCCCACCAG GGCCTCCCCA CGTCTGCCTT TGAGGGTGCC TGCCATGCCC 2220 

TGGGGGATCC TGGCATCTTT ACTGGACTGG AAGCAGGAGA CAGAACAGTG TCTGTCCCGG 2280 

GGTGACTTCA TCAGGAGACC GCCCACATAG AGCTGGACCC CGCAGCTGAA GCGGAAATGT 2340 

GAGACAGGCT GGCACCTCCG GAAAAACTGC CTTTCAGCCT TGGTGTTCCG TGCAAGGTGA 2400 

AAAGAAATAG GTCCTCCCAG TTTACAGCTT GAAATCAGGC TAGTGAGTGG CCCTGGAGAC 2460 

CACGAGGGGA GAATTTAAAG GCCCCGGCTG GCAGGGTCTA GGTGGCTGGC AGAGGCACAT 2520 

GCAGACCCTG CCTGGAGCCT GCCCTAGGAC GCTGGGCGGG TCAGTCTCCG TGCAGGATGT 2580 

GAGCAGCGTC CCTGGGCTCT ATCCGCGAGG TGCCAGTAGC GTGTGCAGGT ACATACACGT 2640 

GCGTGCACAC TGTGATGACA CCCGGAAATG TCTCAGGATG TTGAAATGTG TCCTTGGGGG 2700 

CAGAAGTGTC CCCAGTTGAG AATCTGCCCC AGAGGAACAC ACCCACACCA GGCCTCAGGA 2760 

TTTTGTGTTG ATCAAGTTOC AAGGAAAAGG AACATCTGAG CCGGGCGTGG TGGTTCACGC 2820 

CTGGAATCCC AGCACTTGAG GCCAGGAGTT CCAGAGCAGC CTGGGCAACG CAGTGAGAGA 2880 

CCCCATCTCT ACAARAAAAA AAAAAGAAAG AAAGAAAATG AGAGATCCAG GTTTAAAAAT 2940 

TCATAAACAC CACAAGGAAA CAATACACTA TGAGACCCAG CAGAAGCAAC AGATTGACTC 3000 

TAGACCCAGA TACTAGAATT ATCAGAGAGA ATATAAAGTA ACAGTGTTTT ATATATCTAA 3060 
AGAAATAAAA GAGATTTCTG GAAACATGAA AAAAAA 
SEQ ID NO:197 LBG2 DNA SEQUENCE

Nucleic Acid Accession #: X63529

Coding sequence: 54-2543 (start and stop codons are underlined)

1 11 21 31 41 51

| | | | | |

GCGGAACACC GGCCCGCCGT CGCGGCAGCT GCTTCACCCC TCTCTCTGCA GCCATGGGGC 60
TCCXrrOOTGG ACCTCTCGCG TCTCTCCTCC TTCTCCAGGT TTGCTGGCTO CAGTGCGCGG 120 
CCTOCOAGCC OTOCOGOGCO GTCTTCAGOO AGGCTQAAGT GACCTTGOAO GOGGGAGGCG 180 
CGGAGCAGGA GCCCGGCCAG GCGCTGGGGA AAGTATTCAT GGGCTGCCCT GGGCAAGAGC 240 
CAGCTCTGTT TAGCACTGAT AATGATGACT TCACTGTGCG GAATGGCGAG ACAGTOCAGG 300
AAAGAAGGTC ACTGAAGGAA AGGAATCCAT TGAAGATCTT CCCATCCAAA CGTATCTTAC 360
GAAGACACAA GAGAGATTGG GTGGTTGCIC CAATATCTGT CCCTGAAAAT GGCAAGGGTC 420 
CCTTCCCCCA GAGACTGAAT CAGCTCAAGT CTAATAAAGA TAGAGACACC AAGATTTTCT 480 
ACAGCATCAC GGGGOCGGGG GCAGACAGCC CCCCTGAGGG TGTCT1CGCT GTAGAGAAGG 540 
AGACAGGCTG GTTGTTGTTG AATAAGCCAC TGGAOCGGGA GGAGATTGCC AAGTATGAGC 600 
TCTTTGGCCA CGCTGTGTCA GAGAATGGTG CCTCAGTGGA GGACCCCATG AACATCTCCA 660 
TCATOGTGAC CGACCAGAAT GAOCACAAGC OCAAGTTTAC CCAGGACACC TTCCGAGGGA 720 
GTGTCTTAGA GGGAGTCCTA CCAGGTACTT CTGTGATGCA GGTGACAGCC ACAGATGAGG 780 
ATGATGCCAT CTACACCTAC AATGGGGTGG TTGCTTACTC CATCCATAGC CAAGAACCAA 840 
AGGACCCACA CGAOCTCATG TTCACAATTC AGCGGAGCAC AGGCACCATC AGCGTCATCT 900
CCAGTGGCCT GGAOCGGGAA AAAGTCCCTG AGTACACACT GACCATCCAG GCCACAGACA 960
TGGATGGGGA CGGCTCCACC ACCACGGCAG TGGCAGTAGT GGAGATCCTT GATGOCAATG 1020 
ACAATGCTOC CATGTTTGAC CCCCAGAAGT ACGAGGCCCA TGTGCCTGAG AATGCAGTGG 1080 
GCCATGAGGT GCAGAGGCTG ACGGTCACTG ATCTGGACGC CCCCAACTCA CCAGCGTGGC 1140
GTGCCACCTA CCTTATCATG GGOGGTGACG ACGGGGACCA TTTTACCATC ACCACCCACC 1200 
CTGAGAGCAA CCAGGGCATC CTGACAACCA GGAAGGGTTT GGATTTTGAG GOCAAAAACC 1260 
AGCACACOCT GTACGTTGAA GTGACCAACG AGGCCOCTTT TGTGCTGAAG CTCCCAACCT 1320 
OCACAGCCAC CATAGTGGTC CACGTGGAGG ATGTGAATGA GGCAOCTGTG TTTGTOOCAC 1380 
CCTCCAAAGT CGTTGAGGTC CAGGAGGGCA TCCCCACTGG GGAGCCTGTG TGTGTCTACA 1440 
CTGCAGAAGA CCCTGACAAG GAGAATCAAA AGATCAGCTA CCGCATCCTG AGAGACOCAG 1500 
GAGGGTGGCT AGCCATGGAC CCAGACAGTG GGCAGGTCAC AGCTGTGGGC ACCCTCGACC 1560 
GTGAGGATGA GCAGTTTGTG AGGAACAACA TCTATGAAGT CATGGTCTTG GCCATGGACA 1620 
ATGGAAGOCC TCCCACCACT GGCACGGGAA CCCTTCTGCT AACACTGATT GATGTCAACG 1680 
ACCATGGOCC AGTCCCTGAG CCCCGTCAGA TCACCATCTG CAACCAAAGC CCTGTGGGCC 1740 
ACGTGCTGAA CATCACGGAC AAGGACCTGT CTCCCCACAC CTCCCCnTC CAGGCCCAGC 1800 
TCACAGATGA CTCAGACATC TACTGGACGG CAGAGGTCAA CGAGGAAGGT GACACAGTGG 1860
TCTTGTOCCT GAAGAAGTTC CTGAAGCAGG ATACATATGA CGTGCACCTT TCTCTGTCTG 1920 
ACCATGGCAA CAAAGAGCAG CTGACGGTGA TCAGGGCCAC TGTGTGCGAC TGCCATGGCC 1980 
ATGTCGAAAC CTGCCCTGGA CCCTGGAAAG GAGGTTTCAT CCTOCCTGTG CTGGGGGCTG 2040 
TCCTGGCTCT GCTGTTCCTC CTGCTGGTGC TGCTTTTGTT GGTGAGAAAG AAGCGGAAGA 2100 
TCAAGGAGCC CCTCCTACTC CCAGAAGATG ACACCCGTGA CAACGTCTTC TACTATGGCG 2160
AAGAGGGGGG TGGCGAAGAG GACCAGGACT ATGACATCAC CCAGCTCCAC OGAGGTCTGG 2220 
AGGCCAGGCC GGAGGTGGTT CTCCGCAATG ACGTGGCAOC AACCATCATC CCGACACCCA 2280 
TGTACCGTCC TAGGCCAGCC AACCCAGATG AAATCGGCAA CTTTATAATT GAGAACCTGA 2340
AGGCGGCTAA CACAGACCCC ACAGCCCCGC CCTACGACAC CCTCTTGGTG TTCGACTATG 2400
AGGGCAGCGG CTCCGACGCC GCGTCCCTGA GCTCCCTCAC CICCTCCGCC TCCGACCAAG 2460 
ACCAAGATTA CGATTATCTG AACGAGTGGG GCAGCCCCTT CAAGAAGCTG GCAGACATGT 2520 
ACGGTGGCGG GGAGGACGAC TAGGCGGCCT GCCTGCAGGG CTGGGGAOCA AACGTCAGGC 2580 
CACAGAGCAT CTCCAAGGGG TCTCAGTTCC CCCTTCAGCT GAGGACTTCG GAGCTTGTCA 2640 
GGAAGTGGCC GTAGCAACTT GGCGGAGACA GGCTATGAGT CTGACGTTAG AGTGGTTGCT 2700
TCCTTAGCCT TTCAGGATGG AGGAATGTGG GCAGTTTGAC TTCAGCACTG AAAACCTCTC 2760
CACCTGGGCC AGGGTTGCCT CAGAGGCCAA GTTTCCAGAA GCCTCTTACC TGCCGTAAAA 2820
TGCTCAAOCC TGTGTCCTGG GCCTGGGCCT GCTGTGACTG ACCTACAGTG GACTTTCTCT 2880 
CTGGAATGGA ACCTTCTTAG GCCTCCTGGT GCAACTTAAT TTTTTTTTTT AATGCTATCT 2940 
TCAAAACGTT AGAGAAAGTT CTTCAAAAGT GCAGCCGAGA GCTGCTGGGC CCACTGGCCG 3000
TCCTGCATTT CTGGTTTCCA GACCCCAATG CCTCCCATTC GGATGGATCT CTGCGTTTTT 3060 
ATACTGAGTG TGCCTAGGTT GCCCCTTATT TTTTATTTTC CCTGTTGCGT TGCTATAGAT 3120 
GAAGGGTGAG GACAATCGTG TATATGTACT AGAACTTTTT TATTAAAGAA A



SEQ ID NO:198 LBG2 Protein sequence:
Protein Accession #: CAA45177

1 11 21 31 41 51 

MGLPRGPLAS llLojVCWLQ CAASEPCRAV FREAEVTXEA GGAEQEPGQA LGKVFMGCPG 60 
QEPALFSTDN DDFTVRNGET VQERRSLKER NPUGFPSKR ILRRHKRDWV VAPISVPENG 120 
KGPFPQRLNQ LKSNKDRDTK 1FYSITGPGA DSFFEGVFAV EKETGWLLLN KPLDREEIAK 180 
YELFGHAVSE NGASVEDPMN BHVTDQND HKPKFTQDTF RGSVLEGVLP GTSVMQVTAT 240 
DEDDAIYTYN G WAYS1HSQ EPKDPHDLMF TIHRSTGTIS VISSGLDREK VPEYTLTIQA 300 
TDMDGDGSTT TAVAWEILD ANDNAPMFDP QKYEAHVPEN AVGHEVQRLT VTDLDAPNSP 360 
AWRATYUMG GDDGDHFITT THPESNQGIL TTRKGLDFEA KNQHTLY VEV TNEAPFVLKL 420 
PTSTATIWH VEDVNEAPVF VPPSKWEVQ EGIPTGEPVC VYTAEDPDKE NQKISYRILR 480 
DPAGWLAMDP DSGQVTAVGT LDREDEQFVR NNIYEVMVLA MDNGSPPTTG TGTLLLTLID 540 
VNDHGPVPEP RQITICNQSP VRHVLNITDK DLSPHTSPPQ AQLTDDSDIY WTAEVNEEGD 600 
TWLSLKKFL KQDTYDVHLS LSDHGNKEQL TVIRATVCDC HGHVETCPGP WKGGHLPVL 660 
GAVLALLFLL LVLLLLVRKK RKIKEPLLLP EDDTRDNVFY YGEEGGGEED QDYDITQLHR 720 
GLEARPEVVL RNDVAPTIIP TPMYRPRPAN PDEIGNFIIE NLKAANTDPT APPYDTLLVF 780
DYEGSGSDAA SLSSLTSSAS DQDQDYDYLN EWGSPFKKLA DMYGGGEDD

SEQ ID NO:199 OBIS DNA SEQUENCE 

Nucleic Acid Accession #: NM_012152

Coding sequence: 43-1104 (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51

| | | | | |

CTTCTTTAAA TTTCTTTCTA GGATGTTCAC TTCTTCTCCA CAATGAATGA GTGTCACTAT 60

GACAAGCACA TGGACTTTTT TTATAATAGG AGCAACACTG ATACTGTCGA TGACTGGACA 120 

GGAACAAAGC TTGTGATTGT TTTGTGTGTT GGGACGTTTT TCTGCCTGTT TATTTTTTTT 180 

TCTAATTCTC TGGTCATCGC GGCAGTGATC AAAAACAGAA AATTTCATTT CCCCTTCTAC 240 

TACCTGTTGG CTAATTTAGC TGCTGCCGAT TTCTTCGCTG GAATTGCCTA TGTATTCCTG 300 

ATGTTTAACA CAGGCCCAGT TTCAAAAACT TTGACTGTCA ACCGCTGGTT TCTCCGTCAG 360

GGGCTTCTGG ACAGTAGCTT GACTGCTTCC CTCACCAACT TGCTGGTTAT CGCCGTGGAG 420 

AGGCACATGT CAATCATGAG GATGCGGGTC CATAGCAACC TGACCAAAAA GAGGGTGACA 480 

CTGCTCATTT TGCTTGTCTG GGCCATCCCC ATTTTTATGG GCGCGGTCCC CACACTGGGC 540 

TGGAATTGCC TCTGCAACAT CTCTGCCTGC TCTTCCCTGG CCCCCATTTA CAGCAGGAGT 600 

TACCTTGTTT TCTGGACAGT GTCCAACCTC ATGGCCTTCC TCATCATGGT TGTGGTGTAC 660 

CTGCGGATCT ACGTGTACGT CAAGAGGAAA ACCAACGTCT TGTCTCCGCA TACAAGTGGG 720 

TCCATCAGCC GCCGGAGGAC ACCCATGAAG CTAATGAAGA CGGTGATGAC TGTCTTAGGG 780

GCGTTTGTGG TATGCTGGAC CCCGGGCCTG GTGGTTCTGC TCCTCGACGG CCTGAACTGC 840 

AGGCAGTGTG GCGTGCAGCA TGTGAAAAGG TGGTTCCTGC TGCTGGCGCT GCTCAACTCC 900 

GTCGTGAACC CCATCATCTA CTCCTACAAG GACGAGGACA TGTATGGCAC CATGAAGAAG 960 

ATGATCTGCT GCTTCTCTCA GGAGAACCCA GAGAGGCGTC CCTCTCGCAT CCCCTCCACA 1020 

GTCCTCAGCA GGAGTGACAC AGGCAGCCAG TACATAGAGG ATAGTATTAG CCAAGGTGCA 1080 

GTCTGCAATA AAAGCACTTC CTAAACTCTG GATGCCTCTC GGCCCACCCA GGTGATGACT 1140 
GTCTTAGG 



SEQ ID NO:200 OBIS Protein sequence:

Protein Accessions: NP_036284

1 11 21 31 41 51

| | | | | |

MNECHYDKHM DFFYNRSNTD TVDDWTGTKL VTVLCVGTFF CLPIPPSNSL VTAAVIKNRK 60 

FHFPFYYLLA NLAAADFFAG IAYVFLMFNT GPVSKTLTVK RWFLRQGLLD SSLTASLTNL 120 

LVIAVERHMS IMRMRVHSNL TKKRVTLLIL LVWAIPIFMG AVPTLGWNCL CNISACSSLA 180

PIYSRSYLVF WTV5NLMAFL IMWVYLRIY VYVKRKTNVL SPHTSGSISR RRTPMKLMKT 240 

VHTVLGAPW CWTPGLWLL LDGLNCRQCG VQHVKRWFLL LALLNSWNP IIYSYKDEDM 300 
YGTMKKMIOC FSQENPERRP SRIPSTVLSR SDTGSQYIED SISQGAVCNK STS 

SEQ ID NO:201 PAA6 DNA SEQUENCE 

Nucleic Acid Accession #: AA569531 

Coding sequence: 1-504 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51

| | | | | |

ATGACCTACA GTTACTCATT TTTCAGGCCT GAGTTGATCG TTAATCATCT TAATTATGTT 60 

CATTCTGAAG CCAACAGGAG AACCAAGACC AAAACTTTAT TGTCTCTGCT TTCATTCCTT 120

GATGAAACCT CTGGACTAAG CACACATCTP CCTTGTTTAT CTCTCTCAAA GGAGTGTGGA 180 

GTGCTTCATC TGGACATCCA CGGGAAGAAG GAAGACATGA GAATCACCCA ACAGTCTTCC 240 

CAGCTATACC TGTGGGACAT GGGTGGTTTT ACAATATTTA AGAACCTGTG GATGAGCCTC 300 

ATACCCAGAG GGAACAAACG CTCCCCAAAA AGAGTTACAG AAACCATCCT GAGAGATTTT 360 

AAGCAGAAGC AAAGTTCAAA GATCCAAGAG GAGAGACGAA GAGAGTCTGC AGGACCAAAC 420 

CTCTCTTCAT TCTGGTTTGT GGGGAATGCT GGAAGAGGAG ACAGGCCCCA GATTTGGGCA 480 

GGAAGTAAAC AGTTTTCAGG CTGAGGCCAA TCTGAGCAGG AACATTCCAA TATTTCTTCA 540 

GCTACGTTGT CCCAGCACTT CACTGGTTAA CCTTTTATGT CCACCATTTG TGGATTTCAC 600 

AGCTACTTGT CAATGGTGAA TATTGATCAT CATCATTATC TACTGAGCTG CTACCATATG 660 

CCAGCTACTC TTTGCATGTT GTTCATTATT TTCTCAACAC TCAGCATATT TGCAATATGT 720

TATGTAATAT CACAGACAAG GAAACTGAAC GCAGAAATGT TTTATTTCTT GCCAAACATC 780 

ACATGAGGAT GAACAATGAA ACCGATTTGA AACCAGGATT GTCTGATTCC AACATCTCTG 840 
GGTCCTTTTT CACTCTGATA TGCTGCAATT AAAAAGCCAT TTCTAAGACT GT 



SEQ ID NO:202 PAA6 Protein sequence:
Protein Accession #: none found

1 11 21 31 41 51

| | | | | |

MTYSYSFFRP ELIVNHLNYV HSEANRRTKT KTLLSLLSFL DETSGLSTHL PCLSLSKECG 60
VLHLDIHGKK EDMRITQQSS QLYLWDMGGF TIFKNLWMSL IPRGNKRSPK RVTETILRDF 120
KQKQSSKIQE ERRRESAGPN LSSFWFVGNA GRGDRPQIWA GSKQFSG
SEQ ID NO:203 PAB2 DNA SEQUENCE

Nucleic Acid Accession #: XM_050197

Coding sequence: 310-1971 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

TCACACGTGC CAAGGGGCTG GCTCAGCGGA ACCAGCCTGC ACGCGCTGGC TCCGGGTGAC 60 

AGCCGCGCGC CTCGGCCAGG ATCTGAGTGA TGAGACGTGT CCCCACTGAG GTGCCCCACA 120 

GCAGCAGGTG TTGAGCATGG GCTGAGAAGC TGGACCGGCA CCAAAGGGCT GGCAGAAATG 180 

GGOGCCTGGC TGATTCCTAG GCAGTTGGCG GCAGCAAGGA GGAGAGGCCG CAGCTTCTGG 240 

AGCAGAGCCG AGACGAAGCA GTTCTGGAGT GCCTGAACGG CCCCCTGAGC CCTACCCGCC 300 

TGGCCCACTA TGGTCCAGAG GCTGTGGGTG AGCCGCCTGC TGCGGCACCG GAAAGCCCAG 360 

CTCTTGCTGG TCAACCTGCT AACCTTTGGC CTGGAGGTGT GTTTGGCCGC AGGCATCACC 420 

TATGTGCCGC CTCTGCTGCT GGAAGTGGGG GTAGAGGAGA AGTTCATGAC CATGGTGCTG 480 

GGCATTGGTC CAGTGCTGGG CCTGGTCTGT GTCCCGCTCC TAGGCTCAGC CAGTGACCAC 540 

TGGCGTGGAC GCTATGGCCG CCGCCGGCCC TTCATCTGGG CACTGTCCTT GGGCATCCTG 600 

CTGAGCCTCT TTCTCATCCC AAGGGCCGGC TGGCTAGCAG GGCTGCTGTG CCCGGATCCC 660

AGGCOCCTGG AGCTGGCACT GCTCATCCTG GGCGTGGGGC TGCTGGACTT CTGTGGCCAG 720 

GTGTGCTTCA CTCCACTGGA GGCCCTGCTC TCTGACCTCT TCCGGGACCC GGACCACTGT 780 

CGCCAGGCCT ACTCTGTCTA TGCCTTCATG ATCAGTCTTG GGGGCTGCCT GGGCTACCTC 840 

CTGCCTGCCA TTGACTGGGA CACCAGTGCC CTGGCCCCCT ACCTGGGCAC CCAGGAGGAG 900 

TGCCTCTTTG GCCTGCTCAC CCTCATCTTC CTCACCTGCG TAGCAGCCAC ACTGCTGGTG 960 

GCTGAGGAGG CAGCGCTGGG CCCCACCGAG CCAGCAGAAG GGCTGTCGGC CCCCTCCTTG 1020 

TCGCOCCACT GCTGTCCATG CCGGGCCCGC TTGGCTTTCC GGAACCTGGG CGCCCTGCTT 1080 

CCOCGGCTGC ACCAGCTGTG CTGCOGCATG CCOCGCACCC TGCGCCGGCT CTTCGTGGCT 1140 

GAGCTGTGCA GCTGGATGGC ACTCATGACC TTCACGCTGT TTTACACGGA TTTCGTGGGC 1200 

GAGGGGCTGT ACCAGGGCGT GCCCAGAGCT GAGCCGGGCA CCGAGGCCCG GAGACACTAT 1260 

GATGAAGGCG TTCGGATGGG CAGCCTGGGG CTGTTCCTGC AGTGCGCCAT CTCCCTGGTC 1320 

TTCTCTCTGG TCATGGACCG GCTGGTGCAG OGATTCGGCA CTCGAGCAGT CTATTTGGCC 1380 

AGTGTGGCAG CTTTCCCTGT GGCTGCCGGT GCCACATGCC TGTCCCACAG TGTGGCCGTG 1440 

GTGACAGCTT CAGCCGCCCT CACCGGGTTC ACCTTCTCAG CCCTGCAGAT CCTGCCCTAC 1500 

ACACTGGCCT CCCTCTACCA CCGGGAGAAG CAGGTGTTCC TGCCCAAATA CCGAGGGGAC 1560 

ACTGGAGGTG CTAGCAGTGA GGACAGCCTG ATGACCAGCT TCCTGCCAGG CCCTAAGCCT 1620 

GGAGCTCCCT TCCCTAATGG ACACGTGGGT GCTGGAGGCA GTGGCCTGCT CCCACCTCCA 1680 

CCCGOGCTCT GCGGGGCCTC TGCCTGTGAT GTCTCCGTAC GTGTGGTGCT GGGTGAGCCC 1740 

ACCGAGGCCA GGGTGGTTCC GGGCCGGGGC ATCTGCCTGG ACCTCGCCAT CCTGGATAGT 1800 

GCCTTCCTGC TGTCCCAGGT GGCCCCATCC CTGTTTATGG GCTCCATTGT OCAGCTCAGC 1860 

CAGTCTGTCA CTGCCTATAT GGTGTCTGCC GCAGGCCTGG GTCTGGTCGC CATTTACTTT 1920 

GCTACACAGG TAGTATTTGA CAAGAGCGAC TTGGCCAAAT ACTCAGCGTA GAAAACTTCC 1980

AGCACATTGG GGTGGAGGGC CTGCCTCACT GGGTCCCAGC TCCCCGCTCC TGTTAGCCCC 2040 

ATGGGGCTGC CGGGCTGGCC GCCAGTTTCT GTTGCTGCCA AAGTAATGTG GCTCTCTGCT 2100 

GCCACCCTGT GCTGCTGAGG TGCGTAGCTG CACAGCTGGG GGCTGGGGCG TCCCTCTCCT 2160 

CTCTCCCCAG TCTCTAGGGC TGCCTGACTG GAGGCCTTCC AAGGGGGTTT CAGTCTGGAC 2220 

TTATACAGGG AGGCCAGAAG GGCTCCATGC ACTGGAATGC GGGGACTCTG CAGGTGGATT 2280 

ACCCAGGCTC AGGGTTAACA GCTAGCCTCC TAGTK3AGAC ACACCTAGAG AAGGGTTTTT 2340 

GGGAGCTGAA TAAACTCAGT CACCTGGTTT CCCATCTCTA AGCCCCTTRA CCTGCAGCTT 2400 

CGTTTAATGT AGCTCTTGCA TGGGAGTTTC TAGGATGAAA CACTCCTCCA TGGGATTTGA 2460 

ACATATGAAA GTTATTTGTA GGGGAAGAGT GCTGAGGGGC AACACACAAG AACCAGGTCC 2520 

CCTCAGCCCC ACAGGCACTG GTCTTTTTTG CTNGANTCCA CCCCCCCCCT CTTTACCCTT 2580 
TT 



Protein Accession #: XP_050197

1 11 21 31 41 51

| | | | | |

MVQRLWVSRL LRHRKAQLLL VNLLTFGLEV CLAAGITYVP PLLLEVGVEE KFMTMVLGIG 60 
PVLGLVCVPL LGSASDHWRG RYGRRRPFIW ALSLGILLSL PLIPRAGWIA GLLCPDPRFL 120 
ELALLILGVG LLDFCGQVCF TPLEALLSDL PRDPDHCRQA YSVYAFMISL GGCLGYLLFA 180 
HJWDTSALAP YLGTQEECLF GLLTLIPLTC VAATLLVAEE AALGPTEPAE GLSAPSLSfcH 240 
CCPCRARLAF RNLGALLPRL HQLCCRMPRT LRRLFVAELC SWMALHTPTL FYTDFVGEGL 300 
YQGVPRAEPG TEARRHYDEG VRKGSLGLFL QCAISLVFSL VHDRLVQRFG TRAVYLASVA 360 
AFPVAAGATC LSHSVAWTA SAALTGFTFS ALQILPYTLA SLYHREKQVP LPKYRGDTGG 420 
ASSEDSLMTS FLPGPKPGAP FPNGHVGAGG SGLLPPPPAL CGASACDVSV RVWGEPTEA 480 
RWPGRGICL DLAILPSAFL LSQVAPSLFM GSIVQLSQSV TAYMVSAAGL GLVAIYFATQ 540 
WFDKSDLAK YSA 

SEQ ID NO:205 PAJ3 DNA SEQUENCE

Nucleic Acid Accession #: AK002126

Coding sequence: 1-1593 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

ATGGTTCGOC GGGGGCTGCT TGCGTGGATT TCCCGGGTGG TGGTTTTGCT GGTGCTCCTC 60

TGCTGTGCTA TCTCTGTCCT GTACATGTTG GCCTGCACCC CAAAAGGTGA CGAGGAGCAG 120 

CTGGCACTGC CCAGGGCCAA CAGCCCCAOG GGGAAGGAGG GGTACCAGGC CGTCCTTCAG 180 

GAGTGGGAGG AGCAGCACCG CAACTACGTG AGCAGCCTGA AGCGGCAGAT CGCACAGCTC 240 
AAGGAGGAGC TGCAGGAGAG GAGTGAGCAG CTCAGGAATG GGCAGTACCA AGCCAGCGAT 300

GCTGCTGGCC TGGGTCTGGA CAGGAGCCCC CCAGAGAAAA CCCAGGCCGA CCTCCTGGCC 360 

TTCCTGCACT CGCAGGTGGA CAAGGCAGAG GTGAATGCTG GCGTCAAGCT GGCCACAGAG 420 

TATGCAGCAG TGCCTTTCGA TAGCTTTACT CTACAGAAGG TGTACCAGCT GGAGACTGGC 480 

CTTACCCGCC ACCCCGAGGA GAAGCCTGTG AGGAAGGACA AGCGGGATGA GTTGGTGGAA 540 

GCCATTGAAT CAGCCTTGGA GACCCTGAAC AATCCTGCAG AGAACAGCCC CAATCACCGT 600 

CCTTACACGG CCTCTGATTT CATAGAAGGG ATCTACCGAA CAGAAAGGGA CAAAGGGACA 660 

TTGTATGAGC TCACCTTCAA AGGGGACCAC AAACACGAAT TCAAACGGCT CATCTTATTT 720 

CGACCATTCG GCCCCATCAT GAAAGTGAAA AATGAAAAGC TCAACATGGC CAACACGCTT 780 

ATCAATGTTA TCGTGCCTCT AGCAAAAAGG GTGGACAAGT TCCGGCAGTT CATGCAGAAT 840

TTCAGGGAGA TGTGCATTGA GCAGGATGGG AGAGTCCATC TCACTGTTGT TTACTTTGGG 900 

AAAGAAGAAA TAAATGAAGT CAAAGGAATA CTTGAAAACA CTTCCAAAGC TGCCAACTTC 960 

AGGAACTTTA CCTTCATCCA GCTGAATGGA GAATTTTCTC GGGGAAAGGG ACTTGATGTT 1020 

GGAGCCCGCT TCTGGAAGGG AAGCAACGTC CTTCTCTTTT TCTGTGATGT GGACATCTAC 1080 

TTCACATCTG AATTCCTCAA TACGTGTAGG CTGAATACAC AGCCAGGGAA GAAGGTATTT 1140 

TATCCAGTTC TTTTCAGTCA GTACAATOCT GGCATAATAT ACGGCCACCA TGATGCAGTC 1200 

CCTCCCTTGG AACAGCAGCT GGTCATAAAG AAGGAAACTG GATTTTGGAG AGACTTTGGA 1260 

TTTGGGATGA CGTGTCAGTA TCGGTCAGAC TTCATCAATA TAGGTGGGTT TGATCTGGAC 1320 

ATCAAAGGCT GGGGCGGAGA GGATGTGCAC CTTTATCGCA AGTATCTCCA CAGCAACCTC 1380

ATAGTGGTAC GGACGCCTGT GCGAGGACTC TTCCAGCTCT GGCATGAGAA GCGCTGCATG 1440 

GACGAGCTGA CCCCCGAGCA GTACAAGATG TGCATGCAGT CCAAGGCCAT GAACGAGGCA 1500 

TCCCACGGCC AGCTGGGCAT GCTGGTGTTC AGGCACGAGA TAGAGGCTCA CCTTCGCAAA 1560 
CAGAAACAGA AGACAAGTAG CAAAAAAACA TGA 



Protein Accession #: NP_060841

1 11 21 31 41 51

| | | | | |

MVRRGLLAWI SKVWLLVLL CCAISVLYML ACTPKGDEEQ IALPRANSPT GKEGYQAVLQ 60 

EWEEQHRNYV SSLKRQIAQL KEELQERSEQ LRNGQYQASD AAGLGLDRSP PEKTQADLLA 120 

PLHSQVDKAE VNAGVKLATB YAAVPPDSFT LQKVYQLETG LTKHPEEKFV RKDKRDELVE 180 

AIESALETLN NPAENSPNHR PYTASDFIEG IYRTERDKGT LYELTPKGDH KHEFKRLILF 240 

RPFGPIMKVK NEKLNMANTL INVTVPLAKR VDKFRQFMQN FREMCIBQDG KVHLTWYFG 300 

KEEINEVKGI LENTSKAAHF RNFTFIQLNG EFSRGKGLDV GARPWKGSNV LLFFCBVDIY 360 

FTSBFLOTCR LNTQPGKKVP YPVLFSQYNP GIZYGHHDAV PPLEQQLVIK KETGFWRDFG 420 

FGHTCQYRSD FZNIGGFDLD XKGWGGEDVB LYRKYLHSNL IWRTPVRGL FHLWHEKRCK 480 
DELTPEQYKM CMQSKAMNEA SHGQLGMLVF RHEIEAHLRK QKQKTSSKKT

SEQ ID NO:207 PAJ5 DNA SEQUENCE

Nucleic Acid Accession #: AF189723

Coding sequence: 1-2712 (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51

| | | | | |

ATGATTCCTG TATTGACATC AAAAAAAGCA AGTGAATTAC CAGTCAGTGA AGTTGCAAGC 60 

ATTCTCCAAG CTGATCTTCA GAATGGTCTA AACAAATGTG AAGTTAGTCA TAGGCGAGCC 120 

TTTCATGGCT GGAATGAGTT TGATATTAGT GAAGATGAGC CACTGTGGAA GAAGTATATT 180 

TCTCAGTTTA AAAATCCCCT TATTATGCTG CTTCTGGCTT CTGCAGTCAT CAGTGTTTTA 240 

ATGCATCAGT TTGATGATGC CGTCAGTATC ACTGTGGCAA TACTTATCGT TGTTACAGTT 300 

GCCTTTGTTC AGGAATATCG TTCAGAAAAA TCTCTTGAAG AATTGAGTAA ACTTGTGCCA 360

CCAGAATGCC ATTGTGTGCG TGAAGGAAAA TTGGAGCATA CACTTGCOCG AGACTTGGTT 420 

CCAGGTGATA CAGTTTGCCT TTCTGTTGGG GATAGAGTTC CTGCTGACTT ACGCTTGTTT 480

GAGGCTGTGG ATCTTTCCAT TGATGAGTCC AGCTTGACAG GTGAGACAAC GCCTTGTTCT 540 

AAGGTGACAG CTCCTCAGCC AGCTGCAACT AATGGAGATC TTGCATCGAG AAGTAACATT 600 

GCCTTTATGG GAACACTGGT CAGATGTGGC AAAGCAAAGG GTGTTGTCAT TGGAACAGGA 660 

GAAAATTCTG AATTTGGGGA GGTTTTTAAA ATGATGCAAG CAGAAGAGGC ACCAAAAACC 720 

CCTCTGCAGA AGAGCATGGA CCTCTTAGGA AAACAACTTT CCTTTTACTC CTTTGGTATA 780 

ATAGGAATCA TCATGTTGGT TGGCTGGTTA CTGGGAAAAG ATATCCTGGA AATGTTTACT 840 

ATTAGTGTAA GTTTGGCTGT AGCAGCAATT CCTGAAGGTC TCCCCATTGT GGTCAGAGTG 900 

ACGCTAGCTC TTGGTGTTAT GAGAATGGTG AAGAAAAGGG CCATTGTGAA AAAGCTGCCT 960 

ATTGTTGAAA CTCTGGGCTG CTGTAATGTG ATTTGTTCAG ATAAAACTGG AACACTGACG 1020 

AAGAATGAAA TGACTGTTAC TCACATATTT ACTTCAGATG GTCTGCATGC TGAGGTTACT 1080 

GGAGTTGGCT ATAATCAATT TGGGGAAGTG ATTGTTGATG GTGATGTTGT TCATGGATTC 1140 

TATAACCCAG CTGTTAGCAG AATTGTTGAG GCGGGCTGTG TGTGCAATGA TGCTGTAATT 1200 

AGAAACAATA CTCTAATGGG GAAGCCAACA GAAGGGGCCT TAATTGCTCT TGGAATGAAG 1260 

ATGGGTCTTG ATGGACTTCA ACAAGACTAC ATCAGAAAAG CTGAATACCC TTTTAGCTCT 1320 

GAGCAAAAGT GGATGGCTGT TAAGTGTGTA CACCGAACAC AGCAGGACAG ACCAGAGATT 1380 

TGTTTTATGA AAGGTGCTTA CGAACAAGTA ATTAAGTACT GTACTACATA CCAGAGCAAA 1440 

GGGCAGACCT TGACACTTAC TCAGCAGCAG AGAGATGTGT ACCAACAAGA GAAGGCACGC 1500 

ATGGGCTCAG CGGGACTCAG AGTTCTTGCT TTGGCTTCTG GTCCTGAACT GGGACAGCTG 1560 

ACATTTCTTG GCTTGGTGGG AATCATTGAT CCACCTAGAA CTGGTGTGAA AGAAGCTGTT 1620 

ACAACACTCA TTGCCTCAGG AGTATCAATA AAAATGATTA CTGGAGATTC ACAGGAGACT 1680 

GCAGTTGCAA TCGCCAGTCG TCTGGGATTG TATTCCAAAA CTTCCCAGTC AGTCTCAGGA 1740 

GAAGAAATAG ATGCAATCGA TGTTCAGCAG CTTTCACAAA TAGTACCAAA GGTTGCAGTA 1800 

TTTTACAGAG CTAGCCCAAG GCACAAGATG AAAATTATTA AGTCGCTACA GAAGAACGGT 1860 

TCAGTTGTAG CCATGACAGG AGATGGAGTA AATGATGCAG TTGCTCTGAA GGCTGCAGAC 1920 
ATTGGAGTTG CGATGGGCCA GACTGGTACA GATGTTTGCA AAGAGGCAGC AGACATGATC
CTAGTGGATG ATGATTTTCA AACCATAATG TCTGCAATCG AAGAGGGTAA AGGGATTTAT 
AATAACATTA AAAATTTCGT TAGATTCCAG CTGAGCACGA GTATAGCAGC ATTAACTTTA
ATCTCATTGG CTACATTAAT GAACTTTCCT AATCCTCTCA ATGCCATGCA GATTTTGTGG 
ATCAATATTA TTATGGATGG ACCCCCAGCT CAGAGCCTTG GAGTAGAACC AGTGGATAAA 
GATGTCATTC GTAAACCTCC TCGCAACTGG AAAGACAGCA TTTTGACTAA AAACTTGATA
CTTAAAATAC TTGTTTCATC AATAATCATT GTTTGTGGGA CTTTGTTTGT CTTCTGGCGT
GAGCTACGAG ACAATGTGAT TACACCTCGA GACACAACAA TGACCTTCAC ATGCTTTGTG 
TTTTTTGACA TGTTCAATGC ACTAAGTTCC AGATCCCAGA CCAAGTCTGT GTTTGAGATT 
GGACTCTGCA GTAATAGAAT GTTTTGCTAT GCAGTTCTTG GATCCATCAT GGGAGAATTA 
CTAGTTATTT ACTTTCCTCC GCTTCAGAAG GTTTTTCAGA CTGAGAGCCT AAGCATACTG 
GATCTGTTGT TTCTTTTGGG TCTCACCTCA TCAGTGTGCA TAGTGGCAGA AATTATAAAG 
AAGGTTGAAA GGAGCAGGGA AAAGATCCAG AAGCATGTTA GTTCGACATC ATCATCTTTT 
CTTGAAGTAT GA

Protein Accession #: AAF27613

1 11 21 31 41 51

| | | | | |

MIPVLTSKKA SELPVSEVAS ILQADLQNGL NKCEVSHRRA FHGWNEFDIS EDEPLWKKYI
SQFKNPLIML LLASAVISVL MHQFDDAVSI TVAILIWTV AFVQEYRSEK SLEELSKLVP 
PECHCVREGK LEHTLARDLV PGDTVCLSVG BRVPADLRLF EAVDLSIDES SLTGETTPCS 
KVTAPQPAAT KGDLASRSNI APMGTLVRCG KAKGWIGTG ENSEFGBVFK MMQAEEAPKT 
PLQKSMDLLG KQLSFYSFGI IGIIMLVGWL LGKDILEHFT ISVSLAVAAI PEGLPIWTV 
TLALGVMRKV KKRAIVKKLP IVETLGCCNV ICSDKTGTLT KKEMTVTKIP TSDGLHAEVT 
GVGYNQFGEV IVDGDWBGF YNPAVSRIVE AGCVCMDAVZ RNTJTLMGKPT EGALIALAMK 
MGLDGLQQDY IRKAEYPFSS EQKWMAVKCV HRTQQDRPEI CFMKGAYEQV IKYCTTYQSK 
GQTLTLTQQQ RDVYQQEKAR MGSAGLRVLA LASGPELGQL TFLGLVGIID PPRTGVKEAV 
TTLIASGVSI KHITODSQET AVAIASRLGL YSKTSQSVSG EBIDAMDVQQ LSQIVPKVAV 
FYBASPRHKM KIIKSLQKNG SWAMTGDGV NDAVALKAAD IGVAMGQTGT DVCKEAAEMI 
LVDDDFQTIM SAIEEGKGIY NNIKNFVRFQ LSTSIAALTL ISLATLMNFP NPLNAMQILW 
INIIMDGPPA QSLGVEPVDK DVIRKPPRNW KDSILTKNLI LKILVSSIII VCGTLFVFWR 
ELRDNVITPR DTTHTFTCFV FFDKPNALSS RSQTKSVPEI GLCSNRMFCY AVLGSIMGQL 
LVTYFPPLQK VFQTESLSIL DLLFLLGLTS SVCIVAEIIK KVERSREKIQ KHVSSTSSSF 
LEV 

SEQ ID NO:209 PAV4 VARIANT 1 DNA SEQUENCE

Nucleic Acid Accession #: N62096

Coding sequence: 1-1284 (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51

| | | | | |

ATGGGCTACC AGAGGCAGGA GCCTGTCATC CCGCCGCAGA GAGGATTGCC TTATTCAATG 60

AAGCAAGCTG GGTTTCCTTT GGGAATATTG CTTTTATTCT GGGTTTCATA TGTTACAGAC 120 

TTTTCCCTTG TTTTATTGAT AAAAGGAGGG GCCCTCTCTG GAACAGATAC CTACCAGTCT 180

TTGGTCAATA AAACTTTCGG CTTTCCAGGG TATCTGCTCC TCTCTGTTCT TCAGTTTTTG 240

TATCCTTTTA TAGCAATGAT AAGTTACAAT ATAATAGCTG GAGATACTTT GAGCAAAGTT 300 

TTTCAAAGAA TCCCAGGAGT TGATCCTGAA AACGTGTTTA TTGGTCGCCA CTTCATTATT 360 

GGACTTTCCA CAGTTACCTT TACTCTGCCT TTATCCTTGT ACCGAAATAT AGCAAAGCTT 420 

GGAAAGGTCT CCCTCATCTC TACAGGTTTA ACAACTCTGA TTCTTGGAAT TGTAATGGCA 480 

AGGGCAATTT CACTGGGTCC ACACATACCA AAAACAGAAG ACGCTTGGGT ATTTGCAAAG 540

CCCAATGCCA TTCAAGCGGT CGGGGTTATG TCTTTTGCAT TTATTTGCCA CCATAACTCC 600

TTCTTAGTTT ACAGTTCTCT AGAAGAACCC ACAGTAGCTA AGTGGTCCCG CCTTATCCAT 660 

ATGTCCATCG TGATTTCTGT ATTTATCTGT ATATTCTTTG CTACATGTGG ATACTTGACA 720 

TTTACTGGCT TCACCCAAGG GGACTTATTT GAAAATTACT GCAGAAATGA TGACCTGGTA 780

ACATTTGGAA GATTTTGTTA TGGTGTCACT GTCATTTTGA CATACCCTAT GGAATGCTTT 840 

GTGACAAGAG AGGTAATTGC CAATGTGTTT TTTGGTGGGA ATCTTTCATC GGTTTTCCAC 900 

ATTGTTGTAA CAGTGATGGT CATCACTGTA GCCACGCTTG TGTCATTGCT GATTGATTGC 960 

CTCGGGATAG TTCTAGAACT CAATGGTGTG CTCTGTGCAA CTCCCCTCAT TTTTATCATT 1020

CCATCAGCCT GTTATCTGAA ACTGTCTGAA GAACCAAGGA CACACTCCGA TAAGATTATG 1080

TCTTGTGTCA TGCTTCCCAT TGGTGCTGTG GTGATGGTTT TTGGATTCGT CATGGCTATT 1140 

ACAAATACTC AAGACTGCAC CCATGGGCAG GAAATGTTCT ACTGCTTTCC TGACAATTTC 1200 

TCTCTCACAA ATACCTCAGA GTCTCATGTT CAGCAGACAA CACAACTTTC TACTTTAAAT 1260 
ATTAGTATCT TTCAACTCGA GTAA 



SEQ ID NO:210 PAV4 Variant 1 Protein sequence:
Protein Accession #: none found

1 11 21 31 41 51

| | | | | |

MGYQRQEPVI PPQRGLPYSM KQAGFPLGIL LLFWVSYVTD FSLVLLIKGG ALSGTDTYQS 60

LVNKTFGFPG YLLLSVLQFL YPFIAMISYN IIAGDTLSKV FQRIPGVDPE NVFIGRHFII 120

GLSTVTFTLP LSLYRNIAKL GKVSLISTGL TTLILGIVMA RAISLGPHIP KTEDAWVFAK 180

PNAIQAVGVM SFAFICHHNS FLVYSSLEEP TVAKWSRLIH MSIVISVFIC IFFATCGYLT 240

FTGFTQGDLF ENYCRNDDLV TFGRFCYGVT VILTYPMECF VTREVIANVF FGGNLSSVFH 300

IVVTVMVITV ATLVSLLIDC LGIVLELNGV LCATPLIFII PSACYLKLSE EPRTHSDKIM 360

SCVMLPIGAV VMVFGFVMAI TNTQDCTHGQ EMFYCFPDNF SLTNTSESHV QQTTQLSTLN 420
ISIFQLE

SEQ ID NO:211 PAV4 VARIANT 2 DNA SEQUENCE

Nucleic Acid Accession #: N62096

Coding sequence: 1-1203 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

ATGGGCTACC AGAGGCAGGA GCCTGTCATC CCGCCGCAGT TTTCCCTTGT TTTATTGATA 60

AAAGGAGGGG CCCTCTCTGG AACAGATACC TACCAGTCTT TGGTCAATAA AACTTTCGGC 120 

TTTCCAGGGT ATCTGCTCCT CTCTGTTCTT CAGTTTTTGT ATCCTTTTAT AGCAATGATA 180 

AGTTACAATA TAATAGCTGG AGATACTTTG AGCAAAGTTT TTCAAAGAAT CCCAGGAGTT 240 

GATCCTGAAA ACGTGTTTAT TGGTCGCCAC TTCATTATTG GACTTTCCAC AGTTACCTTT 300

ACTCTGCCTT TATCCTTGTA CCGAAATATA GCAAAGCTTG GAAAGGTCTC CCTCATCTCT 360 

ACAGGTTTAA CAACTCTGAT TCTTGGAATT GTAATGGCAA GGGCAATTTC ACTGGGTCCA 420 

CACATACCAA AAACAGAAGA CGCTTGGGTA TTTGCAAAGC CCAATGCCAT TCAAGCGGTC 480

GGGGTTATGT CTTTTGCATT TATTTGCCAC CATAACTCCT TCTTAGTTTA CAGTTCTCTA 540 

GAAGAACCCA CAGTAGCTAA GTGGTCCCGC CTTATCCATA TGTCCATCGT GATTTCTGTA 600

TTTATCTGTA TATTCTTTGC TACATGTGGA TACTTGACAT TTACTGGCTT CACCCAAGGG 660

GACTTATTTG AAAATTACTG CAGAAATGAT GACCTGGTAA CATTTGGAAG ATTTTGTTAT 720

GGTGTCACTG TCATTTTGAC ATACCCTATG GAATGCTTTG TGACAAGAGA GGTAATTGCC 780

AATGTGTTTT TTGGTGGGAA TCTTTCATCG GTTTTCCACA TTGTTGTAAC AGTGATGGTC 840 

ATCACTGTAG CCACGCTTGT GTCATTGCTG ATTGATTGCC TCGGGATAGT TCTAGAACTC 900

AATGGTGTGC TCTGTGCAAC TCCCCTCATT TTTATCATTC CATCAGCCTG TTATCTGAAA 960 

CTGTCTGAAG AACCAAGGAC ACACTCCGAT AAGATTATGT CTTGTGTCAT GCTTCCCATT 1020 

GGTGCTGTGG TGATGGTTTT TGGATTCGTC ATGGCTATTA CAAATACTCA AGACTGCACC 1080 

CATGGGCAGG AAATGTTCTA CTGCTTTCCT GACAATTTCT CTCTCACAAA TACCTCAGAG 1140 

TCTCATGTTC AGCAGACAAC ACAACTTTCT ACTTTAAATA TTAGTATCTT TCAACTCGAG 1200
TAA 

Protein Accession #: none found

1 11 21 31 41 51 

MGYQRQEPVI PPQFSLVLLI KGGALSGTDT YQSLVNKTFG FPGYLLLSVL QFLYPFIAMI 60

SYNIIAGDTL SKVFQRIPGV DPENVFIGRH FIIGLSTVTF TLPLSLYRNI AKLGKVSLIS 120

TGLTTLILGI VMARAISLGP HIPKTEDAWV FAKPNAIQAV GVMSFAFICH HNSFLVYSSL 180

EEPTVAKWSR LIHMSIVISV FICIFFATCG YLTFTGFTQG DLFENYCRND DLVTFGRFCY 240

GVTVILTYPM ECFVTREVIA NVFFGGNLSS VFHIVVTVMV ITVATLVSLL IDCLGIVLEL 300

NGVLCATPLI FIIPSACYLK LSEEPRTHSD KIMSCVMLPI GAVVMVFGFV MAITNTQDCT 360
HGQEMFYCFP DNFSLTNTSE SHVQQTTQLS TLNISIFQLE 

SEQ ID NO:213 PAV4 VARIANT 3 DNA SEQUENCE

Nucleic Acid Accession #: N62096
Coding sequence: 1-1140 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

ATGGGCTACC AGAGGCAGGA GCCTGTCATC CCGCCGCAGG TCAATAAAAC TTTCGGCTTT 60

CCAGGGTATC TGCTCCTCTC TGTTCTTCAG TTTTTGTATC CTTTTATAGC AATGATAAGT 120 

TACAATATAA TAGCTGGAGA TACTTTGAGC AAAGTTTTTC AAAGAATCCC AGGAGTTGAT 180 

CCTGAAAACG TGTTTATTGG TCGCCACTTC ATTATTGGAC TTTCCACAGT TACCTTTACT 240 

CTGCCTTTAT CCTTGTACCG AAATATAGCA AAGCTTGGAA AGGTCTCCCT CATCTCTACA 300

GGTTTAACAA CTCTGATTCT TGGAATTGTA ATGGCAAGGG CAATTTCACT GGGTCCACAC 360

ATACCAAAAA CAGAAGACGC TTGGGTATTT GCAAAGCCCA ATGCCATTCA AGCGGTCGGG 420 

GTTATGTCTT TTGCATTTAT TTGCCACCAT AACTCCTTCT TAGTTTACAG TTCTCTAGAA 480 

GAACCCACAG TAGCTAAGTG GTCCCGCCTT ATCCATATGT CCATCGTGAT TTCTGTATTT 540 

_ ATCTGTATAT TCTTTGCTAC ATGTGGATAC TTGACATTTA CTGGCTTCAC CCAAGGGGAC 600 

TTATTTGAAA ATTACTGCAG AAATGATGAC CTGGTAACAT TTGGAAGATT TTGTTATGGT 660

GTCACTGTCA TTTTGACATA CCCTATGGAA TGCTTTGTGA CAAGAGAGGT AATTGCCAAT 720 

GTGTTTTTTG GTGGGAATCT TTCATCGGTT TTCCACATTG TTGTAACAGT GATGGTCATC 780 

ACTGTAGCCA CGCTTGTGTC ATTGCTGATT GATTGCCTCG GGATAGTTCT AGAACTCAAT 840 

GGTGTGCTCT GTGCAACTCC CCTCATTTTT ATCATTCCAT CAGCCTGTTA TCTGAAACTG 900 

TCTGAAGAAC CAAGGACACA CTCCGATAAG ATTATGTCTT GTGTCATGCT TCCCATTGGT 960

GCTGTGGTGA TGGTTTTTGG ATTCGTCATG GCTATTACAA ATACTCAAGA CTGCACCCAT 1020 

GGGCAGGAAA TGTTCTACTG CTTTCCTGAC AATTTCTCTC TCACAAATAC CTCAGAGTCT 1080 
CATGTTCAGC AGACAACACA ACTTTCTACT TTAAATATTA GTATCTTTCA ACTCGAGTAA 

SEQ ID NO:214 PAV4 Variant 3 Protein sequence:
Protein Accession #: none found

1 11 21 31 41 51

| | | | | |

MGYQRQEPVX PPQVNKTFGF PQYLLLSVLQ PLYPFIAKIS YNIIAGDTLS KVFQRIPGVD 60

PENVFIGRHF IIGLSTVTFT LPLSLYRNIA KLGKVSL1ST GLTTLILGIV MARAISLGPH 120 

IPKTEDAWVF AKPNAIQAVG VMSFAFICHH NSFLVYSSLE EPTVAKWSRL ZEHSZVISVF 180 

ICIFFATCGY LTFTGFTQGD LFENYCRNDD LVTFGRFCYG VWILTYPME CFVTREVIAN 240 

VFFGGNLSSV FHIWTVMVI TVATLVSLLI DCLGIVLBLN GVLCATPLIF IIPSACYLKL 300 

SEEPRTHSDK IMSCVMLPIG AWMVFGFVM AITNTQDCTH GQEMFYCFPD NFSLTNTSES 360 
HVQQTTQLST LNISIFQLE 



SEQ ID NO:215 PAV4 VARIANT 4 DNA SEQUENCE

Nucleic Acid Accession #: N82096

Coding sequence: 1-1389 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

ATGGGCTACC AGAGGCAGGA GCCTCTCATC CCGCCGCAGA GAGATTTAGA TGACAGAGAA 60

ACCCTTGTTT CTGAACATGA GTATAAAGAG AAAACCTGTC AGTCTGCTGC TCTTTTTAAT 120 

GTTGTCAACT CGATTATAGG ATCTGGTATA ATAGGATTGC CTTATTCAAT GAAGCAAGCT 180 

GGGTTTCCTT TGGGAATATT GCTTTTATTC TGGGTTTCAT ATGTTACAGA CTTTTCCCTT 240 

GTTTTATTGA TAAAAGGAGG GGCCCTCTCT GGAACAGATA CCTACCAGTC TTTGGTCAAT 300 

AAAACTTTCG GCTTTCCAGG GTATCTGCTC CTCTCTGTTC TTCAGTTTTT GTATCCTTTT 360 

ATAGCAATGA TAAGTTACAA TATAATAGCT GGAGATACTT TGAGCAAAGT TTTTCAAAGA 420 

ATCCCAGGAG TTGATCCTGA AAACGTGTTT ATTGGTCGCC ACTTCATTAT TGGACTTTCC 480 

ACAGTTACCT TTACTCTGCC TTTATCCTTG TACCGAAATA TAGCAAAGCT TGGAAAGGTC 540 

TCCCTCATCT CTACAGGTTT AACAACTCTG ATTCTTGGAA TTGTAATGGC AAGGGCAATT 600 

TCACTGGGTC CACACATACC AAAAACAGAA GACGCTTGGG TATTTGCAAA GCCCAATGCC 660 

ATTCAAGCGG TCGGGGTTAT GTCTTTTGCA TTTATTTGCC ACCATAACTC CTTCTTAGTT 720 

TACAGTTCTC TAGAAGAACC CACAGTAGCT AAGTGGTCCC GCCTTATCCA TATGTCCATC 780 

GTGATTTCTG TATTTATCTG TATATTCTTT GCTACATGTG GATACTTGAC ATTTACTGGC 840 

TTCACCCAAG GGGACTTATT TGAAAATTAC TGCAGAAATG ATGACCTGGT AACATTTGGA 900 

AGATTTTGTT ATGGTGTCAC TGTCATTTTG ACATACCCTA TGGAATGCTT TGTGACAAGA 960 

GAGGTAATTG CGAATGTGTT TTTTGGTGGG AATCTTTCAT CGGTTTTCCA CATTGTTGTA 1020 

ACAGTGATGG TCATCACTGT AGCCACGCTT GTGTCATTGC TGATTGATTG CCTCGGGATA 1080 

GTTCTAGAAC TCAATGGTGT GCTCTGTGCA ACTCCCCTCA TTTTTATCAT TCCATCAGCC 1140 

TGTTATCTGA AACTGTCTGA AGAACCAAGG ACACACTCCG ATAAGATTAT GTCTTGTGTC 1200 

ATGCTTCCCA TTGGTGCTGT GGTGATGGTT TTTGGATTCG TCATGGCTAT TACAAATACT 1260 

CAAGACTGCA CCCATGGGCA GGAAATGTTC TACTGCTTTC CTGACAATTT CTCTCTCACA 1320 

AATACCTCAG AGTCTCATGT TCAGCAGACA ACACAACTTT CTACTTTAAA TATTAGTATC 1380 
TTTCAATGA 



SEQ ID NO:216 PAV4 Variant 4 Protein sequence:
Protein Accession #: none found

1 11 21 31 41 51

| | | | | |

KGYQRQEPVT PPQRDLDDRE TLVSEHEYKE KTCQSAALFN WNSIIGSGI IGLPYSMKQA 60 

GFPLGILLLF WVSYVTDFSL VLLIKGGALS GTDTYQSLVN KTFGFPGYLL LSVLQFLYPF 120 

IAMISVNIIA GDTLSKVFQR IPGVDPENVF IGRBFZIGLS TVTFTLPLSL YRNIAKLGKV 180 

SL1STGLTTL ILGIVMARAI SLGPHIPKTE DAWVFAKFNA ICAVGVMSFA FICHHNSFLV 240 

YSSLEEPTVA KWSRLIHMSI VISVFICIFF ATCGYLTFTG FTQGDLFENY CRNDDLVTFG 300 

RFCYGVTVIL TYPMECFVTR EVIANVFFGG NLSSVFHTW TVMVITVATL VSLLIDCLGI 360 

VLELNGVLCA TPLIFIIPSA CYLKLSEEPR THSDK1MSCV MLPIGAWKV FGFVHAITMT 420 
QDCTHGQEMF YCFPDNFSLT NTSESHVQQT TQLSTLNISI FQ 

SEQ ID NO:217 PAV9 DNA SEQUENCE

Nucleic Acid Accession #: NM_017636

Coding sequence: 1-3501 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

ATGGAGGATQ CCTTCGGGGC AGCCGTGGTG ACCGTGTGGG ACAGCGATGC ACACACCACG 60 

GAGAAGCCCA CCGATGCCTA CGGAGAGCTG GACTTCACGG GGGCCGGCCG CAAGCACAGC 120 

AATTTCCTCC GGCTCTCTGA CCGAACGGAT CCAGCTGCAG TTTATAGTCT GGTCACACGC 180 

ACATGGGGCT TCCGTGCCCC GAACCTGGTG GTGTCAGTGC TGGGGGGATC GGGGGGCCCC 240 

GTCCTCCAGA CCTGGCTGCA GGACCTGCTG CGTCGTGGGC TGGTGCGGGC TGCCCAGAGC 300 

ACAGGAGCCT GGATTGTCAC TGGGGGTCTG CACACGGGCA TCGGCCGGCA TGTTGGTGTG 360 

GCTGTAOGGG ACCATCAGAT GGCCAGCACT GGGGGCACCA AGGTGGTGGC CATGGGTGTG 420 

GCCCCCTGGG GTQTGGTCCG GAATAGAGAC ACCCTCATCA ACCCCAAGGG CTCGTTCCCT 480 

GCGAGGTACC GGTGGCGCGG TGACCCGGAG GACGGGGTCC AGTTTCCCCT GGACTACAAC 540 

TACTCGGCCT TCTTCCTGGT GGACGACGGC ACACACGGCT GCCTGGGGGG CGAGAACCGC 600 

TTCCGCTTGC GCCTGGAGTC CTACATCTCA CAGCAGAAGA CGGGCGTGGG AGGGACTGGA 660 

ATTGACATCC CTGTCCTGCT CCTCCTGATT GATGGTGATG AGAAGATGTT GACGCGAATA 720 

GAGAACGCCA CCCAGGCTCA GCTCCCATGT CTCCTCGTGG CTGGCTCAGG GGGAGCTGCG 780 

GACTGCCTGG CGGAGACOCT GGAAGACACT CTGGCCCCAG GGAGTGGGGG AGCCAGGCAA 840 

GGCGAAGCCC GAGATCGAAT CAGGCGTTTC TTTCCCAAAG GGGACCTTGA GGTCCTGCAG 900 

GCCCAGGTGG AGAGGATTAT GACOCGGAAG GAGCTCCTGA CAGTCTATTC TTCTGAGGAT 960 

GGGTCTGAGG AATTCGAGAC CATAGTTTTG AAGGCCCTTG TGAAGGCCTG T6GGAGCTCG 1020

GAGGCCTCAG CCTACCTGGA TGAGCTGCGT TTGGCTGTGG CTT6GAACC6 CGTGGACATT 1080 

GCCCAGAGTG AACTCTTTCG GGGGGACATC CAATGGCGGT CCTTCCATCT CGAAGCTTCC 1140 

CTCATGGACG CCCTGCTGAA TGAOCGGCCT OAGTTCGTGC GCTTGCTCAT TTCCCACGGC 1200 

CTCAGCCTGG GCCACTTCCT 0ACCCC6ATQ CGCCTGQCCC AACTCTACAG CGCGGCGCCC 1260 

TCCAACTCGC TCATCCGCAA CCTTTTGGAC CAGGCGTCCC ACAGCGCAGG CACCAAAGCC 1320 

CCAGCCCTAA AAGGGGGAGC TGCGGAGCTC CGGCCCCCT6 ACGTGGGGCA TGTGCTGAGG 1380 

ATGCTGCTGG GGAAGATGTG CGCGCCGAGG TACOCCTOCG GGGGCGCCTG GGACCCTCAC 1440 

CCAGGCCAGG GCTTCGGGGA GAjGCATGTAT CTGCTCTCGG ACAAGGCCAC CTCGCCGCTC 1500 

TC6CTGGATG CTGGCCTCGG GCAGGCCCCC TGGAGCGACC TGCTTCTTTG GGCACTGTTG 1560 

CTGAACA6GG CACAGATGGC CATGTACTTC TGGGAGATGG GTTCCAATGC AGTTTCCTCA 1620 

GCTCTTGGGG CCTGTTTGCT GCTCCGGGTG ATGGCACGCC TGGAGCCTGA CGCTGAGGAG 1680 

GCAGCACGGA GGAAAGACCT GGCGTTCAAG TTTGAGGGGA TGGGCGTTGA CCTCTTTGGC 1740 

GAGTGCTATC GCAGCAGTGA GGTGAGGGCT GCCCGCCTCC TCCTCCGTCG CTGCCCGCTC 1800 

TGGGGGGATG CCACTTGCCT CCAGCTGGCC ATGCAAGCTG ACGCCCGTGC CTTCTTTGCC 1860 

CAGGATGGGG TACAGTCTCT GCTGACACAG AAGTGGTGGG GAGATATGGC CAGCACTACA 1920 

CCCATCTGGG CCCTGGTTCT CGCCTTCTTT TGCCCTOCAC TCATCTACAC CCGCCTCATC 1980 

ACCTTCAGGA AATCAGAAGA GGAGOCCACA CGGGAGGAGC TAGAGTTTGA CATGGATAGT 2040 

OTCATTAATG GGGAAGGGCC TGTCGGGACG GCGGACCCAG CCGAGAAGAC GCCGCTGGGG 2100 

GTCCCGCGCC AGTCGGGCCG TCCGGGTTGC TGCGGGGGCC GCTGCGGGGG GCGCCGGTGC 2160 

CTACGCCGCT GGTTCCACTT CTGGGGCGCG CCGGTGACCA TCTTCATGGG CAACGTGGTC 2220 

AGCTACCTGC TGTTCCTGCT GCTTTTCTCG CGGGTGCTGC TCGTGGATTT CCAGCCGGCG 2280 

CCGCCCGGCT CCCTGGAGCT GCTGCTC TA T TTCTGGGCTT TCACGCTGCT GTGCGAGGAA 2340 

CTCCGCCAGG GCCTGAGCGG AGGCGGGGGC AGCCTCGCCA GCGGGGGCCC CGGGCCTGGC 2400 

CATGCCTCAC TGAGCCAGCG CCTGCGCCTC TACCTCGCCG ACAGCTGGAA CCAGTGCGAC 2460 

CTAQTGGCTC TCACCTGCTT CCTCCTGGGC GTGGGCTGCC GGCTGACCCC GGGTTTGTAC 2520 

CACCTGGGCC GCACTGTCCT CTGCATCGAC TTCATGGTTT TCACGGTGCG GCTGCTTCAC 2580 

ATCTTCACGG TCAACAAACA GCTGGGGCCC AAGATCGTCA TCGTGAGCAA GATGATGAAG 2640 

GACGTGTTCT TCTTCCTCTT CTTCCTCGGC GTGTGGCTGG TAGCCTATGG CGTGGCCACG 2700 

GAGGGGCTCC TGAGGCCACG GGACAGTGAC TTCCCAAGTA TCCTGCGCCG CGTCTTCTAC 2760 

CGTCCCTACC TGCAGATCTT CGGGCAGATT OCCCAGGAGG ACATGGACGT GGCCCTCATG 2820 

GAGCACAGCA ACTGCTCGTC GGAGCCCGGC TTCTGGGCAC ACCCTCCTGG GGCCCAGGCG 2880 

GGCACCTGCG TCTCCCAGTA TGCCAACTGG CTGGTGGTGC TGCTCCTCGT CATCTTCCTG 2940 

CTCGTGGCCA ACATCCTGCT GGTCAACTTG CTCATTGCCA TGTTCAGTTA CACATTCGGC 3000 

AAAGTACAGG GCAACAGCGA TCTCTACTGG AAGGCGCAGC GTTACCGCCT CATCCGGGAA 3060 

TTCCACTCTC GGCCCGCGCT GGCCCCGCCC TTTATCGTCA TCTCCCACTT GCGCCTCCTG 3120 

CTCAGGCAAT TGTGCAGGCG ACCCCGGAGC CCCCAGCCGT CCTCCCCGGC CCTCGAGCAT 3180 

TTCOGGGTTT ACCTTTCTAA GGAAGCCGAG CGGAAGCTGC TAACGTGGGA ATCGGTGCAT 3240 

AAGGAGAACT TTCTGCTGGC ACGCGCTAGG GACAAGCGGG AGAGCGACTC CGAGCGTCTG 3300 

AAGCGCACGT CCCAGAAGGT GGACTTGGCA CTGAAACAGC TGGGACACAT CCGCGAGTAC 3360 

GAACAGCGCC TGAAAGTGCT GGAGCGGGAG GTCCAGCAGT GTAGCCGCGT CCTGGGGTGG 3420 

GTGGCCGAGG CCCTGAGCCG CTCTGCCTTG CTGCCCCCAG GTGGGCOGCC ACCCCCTGAC 3480 
CTGCCTGGGT CCAAAGA CTG A 



SEQ ID NO:218 PAV9 Protein sequence:

Protein Accession #: none found

1 11 21 31 41 51

| | | | | |

HEDAFGAAW TVWDSDAHTT EKPTDAYGEL DFTGAGRKHS NFLRLSDRTD PAAVYSLVTR 60 

1WGFRAPNLV VSVLGGSGGP VLQTWLQDLL RRGLVRAAQS TGAWIVTGGL BTGIGRHVGV 120 

AVRDHQMAST GGTKWAHGV APWGWRNRD TLINPKGSFP ARYRWRGDPE DGVQPPLDYN 180 

YSAFFLVDDG THGCLGGENR FRLRLESYIS QQKTGVGGTG IDIPVLLLLI DGDEKMLTRI 240 

EHATQAQLPC LLVAGSGGAA DCLAETLEDT LAPGSGGARQ GBARDRIRRP FPKGDLEVLQ 300 

AQVERIMTRK ELLTVYSSED GSEEFETIVL KALVKACGSS EASAYLDELR LAVAWNRVDI 360

AQSELFRGDZ QWRSFHLEAS LMDALLNDRP EFVRLLISHG LSLGHPLTPH RLAQLYSAAP 420 

SNSLIRNLLD QASHSAGTKA PALKGGAAEL RPPDVGHVLR MLLGKMCAPR YPSGGAWDPH 480 

PGQGFGESMY LLSDKATSPL SLDAGLGQAP WSDLLLWALL LNRAQMAMYF WEMGSNAVSS 540 

ALGACLLLRV MARLEPDAEE AARRKDLAPK FEGMGVDLFG ECYRSSEVRA ARLLLRRCPL 600 

WGDATCLQLA MQADARAFFA QDGVQSLLTQ KWWGDMASTT PIWALVLAFP CPPLIYTOLT 660 

TPRKSEEEPT REELEFDMDS VINGEGFVGT ADPAEKTPLG VPRQSGRPGC CGGRCGGRRC 720 

LRRWFHFWGA FVTTFMGNW SYLLFLLLFS BVLLVDFQPA PPGSLELLLY FWAFTLLCEE 780 

LRQGL5GGGG SLASGGPGPG HASLSQRLRL YLADSMNQCD LVALTCFLLG VGCRLTPGLY 840 

HLGRTVLCID FMVFTVRLLH IFTVNKQLGP KIVIVSKMMK DVFFPLFFLG VWLVAYGYAT 900 

EGLLRPRDSD FPSILRRVFY RFYLQIFGQI PQEDMDVALM EHSNCSSEPG FWAHPPGAQA 960 

GTCVSQYANW LWLLLVIFL LVANILLVNL LIAMFSYTFG KVQGNSDLYW KAQRYRLIRE 1020 

FHSRPALAPP FIVISHLRLL LRQLCRRPRS PQPSSPALEH FRVYLSKEAE RKLLTWESVH 1080 

KENFLLARAR DKRESDSERL KRTSQKVDLA LKQLGHIREY EQRLKVLERE VQQC5RVLGW 1140 
VAEALSRSAL LPPGGPPPPD LPGSKD 

SEQ ID NO:219 PBF1 DNA SEQUENCE

Nucleic Acid Accession #: AA054237

Coding sequence: 1-894 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

ATGGAGCCGC GGGCGCTCGT CACGGCGCTC AGCCTCGGCC TCAGCCTGTG CTCCCTGGGG 60 

CTGCTCGTCA CGGCCATCTT CACCGACCAC TGGTACGAGA COGACCCCCG GCGCCACAAG 120 

GAGAGCTGCG AGCGCAGCCG CGCGGGCGCC GACCCCCCGG ACCAGAAGAA CCGCCTGATG 180 

CCGCTGTCGC ACCTGCOGCT GCGGGACTCG CCCCCGCTGG GGCGCCGGCT GCTCOOGGGC 240

G6CCCGGGGC GCGCCGACCC CGAGTCCTGG CGCTCGCTCC TGGGGCTCGG CGGGCTGGAC 300 

GCCGAGTGCG GCCGGCCCCT CTTCGCCACC TACTCGGGCC TCTGGAGGAA GTGCTACTTC 360 

CTGGGCATCG ACCGGGACAT CGACACCCTC ATCCTGAAAG GTATTGCGCA GCGATGCACG 420 

GCCATCAAGT ACCACTTTTC TCAGCCCATC CGCTTGCGAA ACATTCCTTT TAATTTAACC 480

AAGACCATAC AGCAAGATGA GTGGCACCTG CTTCATTTAA GAAGAATCAC TGCTGGCTTC 540 

CTCGGCATGG CCGTAGCCGT CCTTCTCTGC GGCTGCATTG TGGCCACAGT CAGTTTCTTC 600 

TGGGAGGAGA GCTTGACCCA GCACGTGGCT GGACTCCTGT TCCTCATGAC AGGGATATTT 660 

TGCACCATTT CCCTCTGTAC TTATGCCGCC AGTATCTCGT ATGATTTGAA OCGGCTCCCA 720 

AAGCTAATTT ATAGCCTGCC TGCTGATGTG GAACATGGTT ACAGCTGGTC CATCTTTTGC 780

GCCTGGTGCA GTTTAGGCTT TATTGTGGCA GCTGGAGGTC TCTGCATCGC TTATCCGTTT 840 
ATTAGCCGGA CCAAGATTGC ACAGCTAAAG TCTGGCAGAG ACTCCACGGT ATGA 

SEQ ID NO:220 PBF1 Protein sequence:

Protein Accession #: none found

1 11 21 31 41 51

| | | | | |

MEPRALVTAL SLGLSLCSLG LLVTAIPTDH WYETDPRRHK ESCERSRAGA DPPDQKNRLM 60

PLSHLPLRDS PPLGRRLLPG GPGRADPESW RSLLGLGGLD AECGRPLFAT YSGLWRKCYF 120 

LGIDRDIDTL ILKGIAQRCT AIKYHFSQPI RLRNIPFKLT KTIQQDEWHL LHLRRITAGF 180 

LGMAVAVLLC GCIVATVSFP WEESLTQHVA GLLFLMTGIF CTISLCTYAA SISYDLNRLP 240 
KLIYSLPADV EHGYSW5IFC AWCSLGPIVA AGGLCIAYPF ISRTKIAQLK SGRDSTV 






SEQ ID NO:221 PCI4 DNA SEQUENCE

Nucleic Acid Accession #: NM_016570

Coding sequence: 1-1134 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

ATGA GGCGAC TGAATCGGAA AAAAACTTTA AGTTTGGTAA AAGAGTTGGA TGCCTTTCCG 60 

AAGGTTCCTG AGAGCTATGT AGAGACTTCA GCCAGTGGAG GTACAGTTTC TCTAATAGCA 120

TTTACAACTA TGGCTTTATT AACCATAATG GAATTCTCAG TATATCAAGA TACATGGATG 180 

AAGTATGAAT ACGAAGTAGA CAAGGATTTT TCTAGCAAAT TAAGAATTAA TATAGATATT 240 

ACTGTTGCCA TGAAGTGTCA ATATGTTGGA GCGGATGTAT TGGATTTAGC AGAAACAATG 300 

GTTGCATCTG CAGATGGTTT AGTTTATGAA CCAACAGTAT TTGATCTTTC ACCACAGCAG 360 

AAAGAGTGGC AGAGGATGCT GCAGCTGATT CAGAGTAGGC TACAAGAAGA GCATTCACTT 420

CAAGATGTGA TATTTAAAAG TGCTTTTAAA AGTACATCAA CAGCTCTTCC ACCAAGAGAA 480 

GATGATTCAT CACAGTCTCC AAATGCATGC AGAATTCATG GCCATCTATA TGTCAATAAA 540 

GTAGCAGGGA ATTTTCACAT AACAGTGGGC AAGGCAATTC CACATCCTCG TGGTCATGCA 600 

CATTTGGCAG CAC T TGTCAA CCATGAATCT TACAATTTTT CTCATAGAAT AOATCATTTG 660 

TCTTTTO GAG AGCTTGTTCC AGCAATTATT AATCCTTTAG ATGGAACTGA AAAAATTCCT 720

ATAGATCACA ACCAGATGTT CCAATATTTT ATTACAGTTG TGCCAACAAA ACTACATACA 780 

TATAAAATAT CAGCAGACAC CCATCAGTTT TCTGTGACAG AAAGGGAACG TATCATTAAC 840 

CATGCTGCAG GCAGCCATGG AGTCTCTGGG ATATTTATGA AATATGATCT CAGTTCTCTT 900 

ATGGTGACAG TTACTGAGGA GCACATGCCA TTCTGGCAGT TTTTTGTAAG ACTCTGTGGT 960 

ATTGTTCGAG GAATCTTTTC AACAACAGGC ATGTTACATG GAATTGGAAA ATTTATAGTT 1020

GAAATAATTT GCTGTCGTTT CAGACTTGGA TCCTATAAAC CTGTCAATTC TGTTCCTTTT 1080 
GAGGATGGCC ACACAGACAA CCACTTACCT CTTTTAGAAA ATAATACACA TTGA 

SEQ ID NO:222 PCI4 Protein sequence:

Protein Accession #: NP_057654

1 11 21 31 41 51

| | | | | |

MRRLNRKXTL SLVKELDAFP KVPESYVETS ASGGTVSLIA FTTMALLTIM EFSVYQDTWM 60

KYEYEVDKDF SSKLRINIDI TVAMKCQYVG ADVLDLAETM VASADGLVYE PTVFDLSPQQ 120 

KEWQRMLQLI QSRLQEEBSL QDVIPKSAFK ST ST ALP PRE DDSSQSPNAC RIHGHLYVNK 180 

VAGNFHTTVG KAIPHPRGHA HLAALVNHES YNFSHRIDHL SFGELVFAZI NPLDGTEKIA 240 

IDHNQMPQVF ITWPTKLHT YKISADTHQF SVTERERIIN HAAGSHGVSG IFMKYDLSSL 300 

MVTVTEEHMP FWQFFVRLCG IVGGIFSTTG MLHGIGKFIV EIIOCRFRLG SYKPVNSVPF 360

EDGHTDNHLP LLENNTH 

SEQ ID NO:223 PEZ3 DNA SEQUENCE

Nucleic Acid Accession #: NM_001935.1

Coding sequence: 76-2301 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

CGCGCGTCTC CGCCGCCCGC GTGACTTCTG CCTGCGCTCC TTCTCTGAAC GCTCACTTCC 60 

GAGGAGACGC CGACGATGAA GACACCGTGG AAGATTCTTC TGGGACTGCT GGGTGCTGCT 120 

GCGCTTGTCA CCATCATCAC CGTGCCCGTG GTTCTGCTGA ACAAAGGCAC AGATGATGCT 180 

ACAGCTGACA GTCGCAAAAC TTACACTCTA ACTGATTACT TAAAAAATAC TTATAGACTG 240 

AAGTTATACT CCTTAAGATG GATTTCAGAT CATGAATATC TCTACAAACA AGAAAATAAT 300

ATCTTGGTAT TCAATGCTGA ATATGGAAAC AGCTCAGTTT TCTTGGAGAA CAGTACATTT 360

GATGAGTTTG GACATTCTAT CAATGATTAT TCAATATCTC CTGATGGGCA GTTTATTCTC 420 

TTAGAATACA ACTACGTGAA GCAATGGAGG CATTCCTACA CAGCTTCATA TGACATTTAT 480 

GATTTAAATA AAAGGCAGCT GATTACAGAA GAGAGGATTC CAAACAACAC ACAGTGGGTC 540 

ACATGGTCAC CAGTGGGTCA TAAATTGGCA TATGTTTGGA ACAATGACAT TTATGTTAAA 600 

ATTGAACCAA ATTTACCAAG TTACAGAATC ACAT6GACGG GGAAAGAAGA TATAATATAT 660 

AATGGAATAA CTCACTGGGT TTATGAAGAG GAAGTCTTCA GTGCCTACTC TGCTCTGTGG 720 

TGGTCTCCAA ACGGCACTTT TTTAGCATAT GCCCAATTTA ACGACACAGA AGTCCCACTT 780 

ATTGAATACT CCTTCTACTC TGATGAGTCA CTGCAGTACC CAAAGACTGT ACGGGTTCCA 840 

TATCCAAAGG CAGGAGCTGT GAATCCAACT GTAAAGTTCT TTGTTGTAAA TACAGACTCT 900 

CTCAGCTCAG TCACCAATGC AACTTCCATA CAAATCACTQ CTOCT Q CTTC TATGTTGATA 960 

GGGGATCACT ACTTGTGTGA TGTGACATGG GCAACACAAG AAAGAATTTC TTTGCAGTGG 1020 

CTCAGGAGGA TTCAGAACTA TTCGGTCATG GATATTTGTG ACTATGATGA ATCCAGTGGA 1080 

AGATGGAACT GCTTAGTGGC ACGGCAACAC ATTGAAATGA GTACTACTGG CTGGGTTOGA 1140 

AGATTTAGGC CTTCAGAACC TCATTTTACC CTTGATGGTA ATAGCTTCTA CAAGATCATC 1200 

AGCAATGAAG AAGGTTACAG ACACATTTGC TATTTCCAAA TAGATAAAAA AGACTGCACA 1260 

TTTATTACAA AAGGCACCTG GGAAGTCATC GGGATAGAAG CTCTAACCAG TGATTATCTA 1320 

TACTACATTA GTAATGAATA TAAAGGAATG CCAGGAGGAA GGAATCTTTA TAAAATCCAA 1380 

CTTATTGACT ATACAAAAGT GACATGCCTC AGTTGTGAGC TGAATCCGGA AAGGTGTCAG 1440 

TACTATTCTG TGTCATTCAG TAAAGAGGCG AAGTATTATC AGCTGAGATG TTCCGGTCCT 1500 

GGTCTGOCCC TCTATACTCT ACACAGCAGC GTGAATGATA AAGGGCTGAG AGTCCTGGAA 1560 

GACAATTCAG CTTTGGATAA AATGCTGCAG AATGTCCAGA TGCCCTCCAA AAAACTGGAC 1620 

TTCATTATTT TGAATGAAAC AAAATTTTGG TATCAGATGA TCTTCCCTCC TCATTTTGAT 1680 

AAATCCAAGA AATATCCTCT ACTATTAGAT GTGTATGCAG GCCCATGTAG TCAAAAAGCA 1740 

GACACTGTCT TCAGACTGAA CTGGGCCACT TACCTTGCAA GCACAGAAAA CATTATAGTA 1800 

GCTAGCTTTG ATGGCAGAGG AAGTGGTTAC CAAGGAGATA AGATCATGCA TGCAATCAAC 1860 

AGAAGACTGG GAACATTTGA AGTTGAAGAT CAAATTGAAG CAGCCAGACA ATTTTCAAAA 1920 

ATGGGATTTG TGGACAACAA ACGAATTGCA ATTTGGGGCT GGTCATATGG AGGGTACGTA 1980 

ACCTCAATGG TCCTGGGATC GGGAACTGGC GTGTTCAAGT GTGGAATAGC CGTGGCGCCT 2040 

GTATCCCGGT GGGAOTACTA TGACTCAGTG TACACAGAAC GTTACATGGG TCTCCCAACT 2100 

CCAGAAGACA ACCTTGACCA TTACAGAAAT TCAACAGTCA TGAGCAGAGC TGAAAATTTT 2160 

AAACAAGTTG AGTAOCTCCT TATTCATGGA ACAGCAGATG ATAACGTTCA CTTTCAGCAG 2220 

TCAGCTCAGA TCTCCAAAGC CCTGGTCGAT OTTGGAGTGG ATTTCCAGGC AATGTGGTAT 2280 

ACTGATGAAG ACCATGGA AT AGCTAGCAGC ACAGCACACC AACATATATA TACCCACATG 2340 

AGCCACTTCA TAAAACAATG TTTCTCTTTA CCTTAGCACC TCAAAATACC ATGCCATTTA 2400 

AAGCTTATTA AAACTCATTT TTGTTTTCAT TATCTCAAAA CTGCACTGTC AAGATGATGA 2460 

TGATCTTTAA AATACACACT CAAATCAAGA AACTTAAGGT TACCTTTGTT CCCAAATTTC 2520 

ATACCTATCA TCTTAAGTAG GGACTTCTGT CTTCACAACA GATTATTACC TTACAGAAGT 2580 

TTGAATTATC CGGTCGGGTT TTATTGTTTA AAATCATTTC TGCATCAGCT GCTGAAACAA 2640 

CAAATAGGAA TTGTTTTTAT GGAGGCTTTG CATAGATTCC CTGAGCAGGA TTTTAATCTT 2700 

TTTCTAACTG GACTGGTTCA AATGTTGTTC TCTTCTTTAA AGGGATGGCA AGATGTGGGC 2760 

AGTGATGTCA CTAGGGCAGG GACAGGATAA GAGGGATTAG GGAGAGAAGA TAGCAGGGCA 2820 

TGGCTGGGAA CCCAAGTCCA AGCATACCAA CACGAGCAGG CTACTGTCAG CTCCCCTCGG 2880 

AGAAGAGCTG TTCACCACGA GACTGGCACA GTTTTCTGAG AAAGACTATT CAAACAGTCT 2940 

CAGGAAATCA AATATCGAAA GCACTGACTT CTAAGTAAAC CACAGCAGTT GAAAGACTCC 3000 

AAAGAAATGT AAGGGAAACT GCCAGCAACG CAGCCCCCAG GTGCCAGTTA TGGCTATAGG 3060 

TGCTACAAAA ACACAGCAAG GGTGATGGGA AAGCATTGTA AATGTGCTTT TAAAAAAAAA 3120 

TACTGATGTT CCTAGTGAAA GAGGCAGCTT GAAACTGAGA TGTGAACACA TCAGCTTGCC 3180 

CTGTTAAAAG ATGAAAATAT TTGTATCACA AATCTTAACT TGAAGGAGTC CTTGCATCAA 3240 

TTTTTCTTAT TTCATTTCTT TGAGTGTCTT AATTAAAAGA ATATTTTAAC TTCCTTGGAC 3300 

TCATTTTAAA AAATGGAACA TAAAATACAA TGTTATGTAT TATTATTCCC ATTCTACATA 3360 
CTATGGAATT TCTCCCAGTC ATTTAATAAA TGTGCCTTCA TTTTTTC 



SEQ ID NO:224 PEZ3 Protein sequence:

Protein Accession #: NP_001926.1

1 11 21 31 41 51

| | | | | |

KKTPWKILLG LLGAAALVTI ITVPWLLNK GTDDATADSR KTYTLTDYLK NTYRLKLYSL 60 

RWISBHEYLY KQENNILVFN AEYGNSSVPL ENSTFDEFGH SINDYSISPD GQFILLEYNY 120 

VKQWRHSYTA SYDIYDLKKR QLITEERIFN NTQWVTWSPV GHKLAYVWNM DIYVKIEPNL 180 

PSYRITWTGK EDIIYNGITD WVYEEEVFSA YSALWWSPNG TFLAYAQFND TEVPLIEYSF 240 

YSDESLQYPK TVRVPYPKAG AVNPTVKPFV VNTDSLSSVT HATSIQITAP ASMLIGDHYL 300 

CDVTKATQER ISLQWLRRIQ NYSVMDICDY DESSGRWNCL VARQHIEMST TGWVGRFRPS 360 

BPHFTLDGNS FYKIISNEEG YRHICYFQID KKDCTFITKG TWEVIGIEAL TSDYLYYISN 420 

BYKGKPGGRN LYKIQLIDYT KVTCLSCEUJ PBRCQYYSVS FSKEAKYYQL RCSGPGLPLY 480

TLHSSVNDKG LRVLEDNSAL DKHLQNVQMP SKKLDFIILN ETKFWYQMIL PPHFDKSKKY 540 

PLLLDVYAGP CSQKADTVFR LNWATYLAST ENIIVASFDG RGSGYQGDKI MHAINRRLGT 600 

FEVEDQIEAA RQFSKMGFVD NKRIAIWGWS YGGYVTSMVL GSGSGVFKCG IAVAPVSRWE 660 

YYDSVYTERY KGLPTPEDNL DHYRNSTVMS RAENFKQVEY LLIHGTADDN VHFQQSAQIS 720 
KALVDVGVDF QAKWYTDEDH GIASSTAHQH IYTHMSHFIK QCFSLP 

SEQ ID NO:225 PBJ2 DNA SEQUENCE

Nucleic Acid Accession #: none found

Coding sequence: 1-261 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

ATGGCTCTGG CGAAGGTGAG GGAGCCAAAC GCAAATGACA ATGCCATCAG AGTTGACAAC 60

AGAAGTGTGA TTAAAGTGCG TGCTAACCAG T GT TCCCTGC ATGAGGCAGA AAGTGAATCC 120 

AGAAACCCTC AGGAGCTCTG GATGGGCCTG CTCCTCTTOA TGGGGGTCCT AGAAGCATGT 180 

GTGGAAATGA GGCCTCTGTC AGTCTGGTCC CTGAGAGATG ACAAGGAGCA GAGCCCCCAC 240 
CAGCCCACAC TGGATGT CTA A 



SEQ ID NO:226 PBJ2 Protein sequence:

Protein Accession #: none found

1 11 21 31 41 51

| | | | | |

MALAKVREPN ANDNAIRVDN RSVIKVRANQ CSLHEAESES RNPQELWMGL LLUiGVLEAC 60 
VEMRPLSVWS LRDDKEQSPH QPTLDV 

SEQ ID NO:227 PBM2 DNA SEQUENCE

Nucleic Acid Accession #: none found

Coding sequence: 1-462 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

ATGCCAAATG CTGAGTTAGA AGCAAAGAGC CTTGGAAGCA GTAAATGTTT AAAAACTGCT 60 

CTCATACTTG CTGTATGTTG TGGATCAGCA AATATAGTCA GCCCTCTACT TGAGCAAAAT 120 

ATTGATGTAT CTTCTCAAGA TCTGGACAGA CGGCCAGAGA GTATGCTGTT TCTAGTCATC 180 

ATCATGTGGA CCAGTTTTGT GGAAGACAAT CTTTCCATGG GCTGGGGGAA GCTAGAAGAT 240 

TTTATGGCTA TTGAAGAAGA AATCAAGAAG CACGGAAGTA CTCATGTGGC ATTCCCAGAA 300 

AACCTGACTA ATGGTGCCGC TGCTGGCAAT GGTGATGATG GATTAATTCC TCCAAGGAAG 360 

AGCAGAACAC CTGAAAGCCA GCAATTTCCT GACACTGAGA ATGAAGAGTA TCACAGGTTT 420 
GTCAAAOATC AGATAGTTGT AGATATGCGG CGTTATTT CT GA 

SEQ ID NO:228 PBM2 Protein sequence:

Protein Accession #: none found

1 11 21 31 41 51

| | | | | |

MENAELEAKS LGSSKCLKTA LILAVCCGSA NIVSPLLEQN IDVSSQDLDR RPESMLFLVI 60 
IKWTSFVEDN LSMGWGKLED FMAXEEEMKK HGSTHVGFPE NLTNGAAAGK GCDGLIPPRK 120 
SRTPESQQFP DTENEEYHRF VKDQIWDMR RYF 

SEQ ID NO:229 PEZ2 DNA SEQUENCE

Nucleic Acid Accession #: NM_014253

Coding sequence: 65-8242 (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51

| | | | | |

GACTGCTTGC ATTAAAGGAC TTCCTCATCC 1TTTTTTCAT GAAACTGAGC 1TGCTTAATC 60

AGAGATGGAG CAAACTGACT GCAAACCCTA CCAGCCTCTA CCAAAAGTCA AGCATGAAAT 120

GGATCTAGCT TACACCAGTT CTTCTGATGA GAGTGAAGAT GGAAGAAAAC CAAGACAGTC 180

ATACAACTCC AGGGAGACCC TGCACGAGTA TAACCAGGAG CTGAGGATGA ATTACAAIAG 240

CCAGAGTAGA AAGAGGAAAG AAGTAGAAAA ATCTACTCAA GAGATGGAAT TCTGTGAAAC 300

CTCTCACACT CTGTGCTCTG GCTACCAAAC AGACATGCAC AGCGTTTCTC GGCATGGCTA 360

CCAGCTAGAG ATGGGATCTG ATGTGGACAC AGAGACAGAA GGTGCTGCCT CACCTGACCA 420

TGCACTAAGA ATGTGGATAA GGGGAATGAA ATCAGAGCAT AGTTCCTGTT TGTCCAGCCG 480

GGCCAACTCT GCATTATCCT TGACTGACAC TGACCATGAA AGGAAGTCTG ATGGGGAAAA 540

TGGTTTCAAA TTCTCTCCTG TTTGTTGTGA CATGGAGGCT CAAGCTGGGT CTACTCAAGA 600

TGTGCAGAGC AGOCCACACA ACCAGTTCAC CTTCAGACCC CTCCCACCGC CACCTCCGCC 660

TCCTCATGCC TGCACCTGTG CCAGGAAGCC ACCCCCTGCA GCGGACTCTC TTCAGAGGAG 720

ATCAATGACT ACCCGCAGCC AGCCCAGCCC AGCTGCTCCA GCTCCCCCAA CCAGCACGCA 780

GGATTCAGTC CATCTGCATA ACAGCTGGGT CCTGAAGAGC AACATACCAT TGGAGACCAG 840

GCATTCCCTG TTCAAACATG GATCTGGTTC CTCTGCGATC TTCAGTGCAG CCAGTCAGAA 900

CTAOCCTCTG ACATCCAATA CCGTGTACTC GCCCCCTCCC AGGCCTCTTC CTCGAAGCAC 960

CTTTTCCCGA CCTGCCTTTA CCTTTAACAA ACCTTACAGG TGCTGCAACT GGAAGTGCAC 1020

AGCATTGAGC GCCACTGCAA TCACAGTGAC TTTGGCCTTG TTACTAGCCT ATGTGATTGC 1080

AGTGCATTTG TTCGGCCTGA CTTGGCAGTT GCAACCAGTT GAAGGAGAGC TGTATGCAAA 1140

TGGAGTTAGC AAAGGGAACA GGGGGACCGA GTCCATGGAC ACTACTTACT CTCCAATTGG 1200

AGGAAAAGTT TCTGATAAAT CAGAGAAAAA AGTGTTTCAG AAGGGACGGG CGATAGACAC 1260

TGGAGAAGTT GACATTGGTG CACAGGTCAT GCAGACCATT CCACCTGGTT TATTCTGGCG 1320

TTTCCAGATT ACTATCCACC ATCCAATATA TCTGAAGTTC AATATTTCTT TAGCCAAGGA 1380

CTCTCTGCTG GGAATTTATG GCAGAAGAAA CATTCCACCT ACACATACTC AGTTTGATTT 1440

TGTAAAACTA ATGGATGGCA AACAGCTGGT CAAGCAGGAC TCCAAGGGCT CTGATGATAC 1500

ACAGCACTCC CCTCGGAACC TGATCTTAAC TTCGCTTCAG GAGACAGGTT TCATAGAGTA 1560

TATGGATCAA GGACCTTGGT ATCTGGCGTT TTACAATGAT GGAAAAAAGA TGGAGCAAGT 1620

ATTCGTGTTA ACTACAGCAA TTGAAATAAT GGATGACTGT TCAACCAATT GCAATGGAAA 1680

TGGAGAGTGT ATCTCTGGCC ATTGTCATTG TTTCCCAGGA TTOCTTGGAC CTGACTGTGC 1740

TAGAGATTCC TGCCCTGTGC TGTGTGGTGG GAATGGAGAA TACGAGAAAG GACACTGTGT 1800

CTGCCGGCAT GGCTGGAAGG GGCCAGAGTG TGACGTTCCG GAAGAACAAT GCATTGATCC 1860

AACATGCTTT GGCCACGGCA CCTGCATCAT GGGAGTCTGC ATCTGTGTGC CAGGATACAA 1920
GGGAGTCTGC ATCTGTGTGC CAGGATACAA 1920 
AGGAGAAATA TGCGAGGAAG AGGACTGCCT AGACCCAATG TGTTCCAACC ATGGCATCTG 1980

TGTAAAAGGA GAATGTCACT GTTCTACTGG CTGGGGAGGA GTTAACTGTG AAACACCACT 2040 

TCCTGTATGT CAAGAGCAGT GCTCAGGACA CGGAACTTTT CTTCTGGACG CTGGAGTATG 2100

CAGCTGTGAT CCCAAGTGGA CAGGATCTGA CTGCTCAAGA GAGCTGTGTA CCATGGAGTG 2160 

TGGTAGCCAT GGAGTCTGCT CAAGAGGAAT TT6CCAGTGT GAAGAAGGCT GGGTAGGACC 2220 

AACATGTGAG GAACGCTCCT GTCATTCTCA TTGTACTGAG CATGGCCAAT GCAAAGATGG 2280 

AAAATGTGAG TGTAGCCCTG GATGG6AGGG CQACCACTGC ACAATTGCTC ACTACTTAGA 2340 

TGCTGTCCGA GATGGCTGCC CAGGGCTCTG CTTTGGAAAT GGACGATGTA CCCTGGATCA 2400 

AAATGGTTGG CACTGTGTGT GTCAGGTGGG TTGGAGTGGG ACAGGCTGCA ATGTTGTCAT 2460 

GGAAATGCTT TGTGGAGATA ACTTGGACAA TGATGGAGAT GGTTTAACCG ACTGTGTGGA 2520 

TCCTGACTGT TGTCAACAAA GCAACTGTTA TATAAGTCCT CTCTGCCAGG GCTCACCAGA 2580 

TCCTCTTGAC CTCATTCAGC AAAGCCAAAC TCTCTTCTCT CAGCACACTT CAAGACTTTT 2640 

TTATGATCGA ATCAAATTCC TCATTGGCAA GGACAGTACT CATGTCATTC CTCCTGAGGT 2700 

GTCATTTGAC AGCAGGCGTG CCTGTGTGAT TCGAGGCCAA GTGGTGGCCA TAGATGGAAC 2760 

TCCTCTAGTG GGAGTGAATG TCAGTTTCTT GCACCACAGT GATTATGGGT TTACCATCAG 2820 

CCGGCAAGAT GGAAGCTTTG ACCTCGTGGC CATCGGTGGC ATCTCTGTCA TCTTAATCTT 2880 

CGACCGATCC CCTTTCCTGC CTGAGAAGAG AACACTCTGG TTGCCTTGGA ATCAGTTTAT 2940 

TGTGGTAGAG AAAGTCACCA TGCAGAGAGT TGTATCAGAC CCGCCATCCT GCGATATCTC 3000 

CAACTTTATC AGCCCAAACC CTATTGTGCT TCCTTCACCG CTCACATCAT TTGGAGGGTC 3060 

CTGTCCAGAG AGGGGAACTA TTGTTCCTGA GCTGCAGGTT GTACAGOAGG AAATTCCCAT 3120 

TCCCTCCAGC TTTGTGAGGC TGAGTTACCT GAGCAGCCGC ACCCCTGGGT ATAAAACCCT 3180 

GCTACGGATC CTTCTGACAC ATTCAACGAT TCCCGTAGGC ATGATAAAAG TACACCTCAC 3240 

AGTAGCTGTG GAAGGGCGAC TCACACAGAA GTGGTTTC CC GCCGCAATTA ATCTTGTCTA 3300 

CACATTTGCT TGGAACAAGA CCGATATCTA TGGACAGAAG GTTTGGGGCC TGGCAGAGGC 3360 

TTTGGTATCT GTGGGATATG AATATGAAAC GTGCCCTGAC TTTATTCTCT GGGAGCAAAG 3420 

GACAGTCGTT TTACAAGGTT TTGAGATGGA TGCTTCTAAC CTAGGAGACT GGTCTTTGAA 3480 

TAAGCATCAC ATTTTGAATC CTCAAAGTGG AATCATACAT AAAGGGAATG GAGAAAATAT 3540 

GTTCATTTCC CAGCAGCCCC CAGTCATATC AACCATAATG GGTAATGGAC ACCAAAGGAG 3600 

TGTAGCCTGC ACCAACTGCA ATGGCCCAGC CCACAACAAC AAACTCTTTG CTCCTGTCGC 3660 

CTTAGCTTCT GGCCCTGATG GCAGTGTGTA TGTTGGCGAC TTCAATTTTG TAAGGAGAAT 3720 

ATTTOCCTCG GGAAACTCCG TTAGTATTTT GGAATTAAGC ACAAGTCCTG CTCACAAATA 3780 

CTATCTGGCT ATGGACCCTG TQTCTGAATC ACTCTATCTA TCAGACACCA ATACTCGCAA 3840 

AGTCTACAAG TTGAAATCTC TTGTGGAGAC GAAAGATCTG TCCAAGAATT TTGAAGTGGT 3900 

GGCAGGAACT GGTGATCAGT GCCTTCCCTT TGACCAGAGT CATTGTGGAG ATGGTGGGAG 3960 

AGCATCGGAA GCTTCACTGA ATAGCCCTCG AGGCATCACA GTTGATAGGC ATGGATTTAT 4020 

TTACTTTGTG GATGGGACTA TGATTCGCAA AATTGATGAG AATGCTGTGA TCACAACTGT 4080 

AATCGGCTCA AATGGTCTGA CTTCCACACA ACCACTGAGC TGTGACTCAG GAATGGACAT 4140 

CACTCAGGTG CGATTAGAGT GGCCAACAGA CCTTGCAGTA AATCCTATGG ACAATTCATT 4200 

GTATGTCTTG GATAACAACA TTGTGCTGCA AATTTCTGAG AACAGGCGTG TTCGGATCAT 4260 

CGCAGGACGC CCCATTCACT GCCAGGTGCC AGGCATCGAT CATTTCCTGG TCAGCAAGGT 4320 

AGCAATTCAC TCCACTCTAG AGTCAGCGAG GGCCATCAGT GTCTCCCACA GCGGGCTGCT 4380 

CTTCATAGCT GAAACAGACG AGAGGAAAGT AAACCGCATT CAGCAAGTAA CCACCAATGG 4440 

GGAGATCTAC ATCATCGCTG GTGCCCCCAC TGACTGTGAC TGCAAAATTG ATCCAAACTG 4500 

TGACTGTTTT TCAGGTGATG GTGGCTATGC CAAAGATGCA AAGATGAAAG CCCCTTCCTC 4560 

CTTAGCAGTG TCGCCTGATG GAACCCTCTA TGTGGCAGAC CTCGGAAATG TTCGAATTCG 4620 

TACCATCAGC AGGAACCAAG CCCACCTGAA TGACATGAAC ATTTATGAGA TTGCTTCACC 4680 

OGCTGATCAG GAACTGTACC AGTTCACTGT AAATGGAACC CACCTACACA CCCTGAACTT 4740 

GATAACAAGG GACTATGTTT ATAACTTCAC CTACAATTCT GAAGGTGACT TGGGOGCGAT 4800 

TACCAGCAGC AATGGCAATT CAGTGCACAT TCGCCGTGAT GCAGGCGGAA TGCCGCTATG 4860 

GCTTGTGGTG CCTGGCGGAC AAGTATACTG GCTGACTATA AGCAGCAATG GAGTCCTGAA 4920 

AAGAGTGTCA GCCCAAGGCT ATAATCCGGC CTTAATGACC TATCCAGGAA ACACAGGGCT 4980 

TCTGGCTACC AAAAGTAACG AAAATGGATG GACAACCGTT TATGAGTATG ACCCOGAGGG 5040 

ACACCTGACC AATGCAACGT TTCCCACTGG AGAGGTCAGC AGCTTCCACA GTGADCTGGA 5100 

GAAGCTGACA AAAGTGGAGC TAGATACTTC CAACCGTGAA AATGTCCTCA TGTCAACCAA 5160 

CTTGACGGCA ACTAGTACCA TATATATTTT AAAACAAGAA AATACTCAAA GTACCTATCG 5220 

GGTGAATCCA GATGGTTCCC TGCGTGTCAC TTTTGOCAGC GGGATGGAGA TCGGCCTCAG 5280 

CTCAGAGCCC CACATCCTGG CAGGGGCAGT CAACCCTACC CTGGGCAAAT GCAACATCTC 5340

ATTGCCCGGA GAGCACAATG CAAACCTCAT CGAGTGGCGG CAGAGGAAGG AGCAAAACAA 5400 

AGGCAATGTT TCGGCTTTTG AAAGGAGGCT GAGGGCCCAC AACAGAAACC TACTCTCCAT 5460 

AGATTTTGAT CATATAACCC GCACAGGAAA GATCTATGAT GACCATCGAA AATTCACCCT 5520 

TCGAATTCTT TATGACCAGA CTGGGCGACC CATTCTGTGG TCTCCTGTAA GCAGATATAA 5580 

TGAAGTGAAC ATCACATATT CACCTTCGGG ATTGGTGACG TTTATTCAAA GAGGAACGTG 5640 

GAATGAAAAA ATGGAATATG ACCAGAGTGG GAAAATTA3T TCAAGAACTT GGGCTGATGG 5700 

GAAAATTTGG AGCTATACCT ACTTAGAAAA ATCTGTGATG CTTCTCCTAC ACAGCCAGCG 5760 

GCGTTACATC TTTGAGTATG ACCAATCAGA TTGCCTGCTG TCAGTTACCA TGCCTAGCAT 5820 

GGTCCCCCAC AGCTTACAAA CCATGCTTTC AGTGGGCTAC TACCGTAATA TCTACACCCC 5880 

ACCGGACAGT AGCACTTCTT TTATCCAAGA CTATAGTCGA GATGGCCGAT TGCTACAGAC 5940 

CCTGCATCTG GGGACAGGGC GCAGAGTCTT ATACAAGTAC ACCAAGCAAG CAAGGCTTTC 6000 

TGAGGTTCTC TATGATACCA CTCAGGTCAC ATTAACATAT GAAGAGTCTT CTGGAGTGAT 6060 

TAAGACAATA CACCTGATGC ATGACGGATT CATCTGCACA ATCAGATACA GGCAAACAGG 6120 

ACCTCTTATT GGACGCCAGA TTTTCAGATT CAGTGAAGAA GGCCTOGTGA ATGCACGGTT 6180 

OGACTACAGC TACAACAATT TCCGAGTCAC AAGCATGCAA GCTG7AATCA ATGAAACCCC 6240 

TTTGCCTATA GATCTTTACC GATATGTTGA TGTCTCTGGC AGAACAGAGC AGTTTGGAAA 6300 

ATTCAGTGTA ATTAATTACG ATTTAAATCA GGTCATAACT ACTACAGTGA TGAAACACAC 6360 

CAAAATCTTC AGTGCCAATG GACAAGTCAT TGAAGTCCAA TATGAAATCC TAAAGGCAAT 6420 

TGCCTACTGG ATGACCATTC AATATGATAA TGTGGGCCGA CATGGTAATA TGTGCATAAG 6480 

GGTAGGAGTA GATGCCAATA TAACAAGGTA CTTCTATGAA TACGATGCTG ATGGGCAACT 6540 

TCAGACTGTT TCTGTAAATG ACAAAACCCA GTGGCGTTAT AGTTACGATC TGAATGGAGA 6600 

CATCAACCTC TTAAGCCATG GGAAGAGTGC TCGTCTTACT CCTCTCCGAT ATGACCTCCG 6660 

AGACCGCATC ACCAGATTAG GAGAAATTCA GTATAAAATG GATGAAGATG GCTTTCTGAG 6720 
GCAGAGGGGA AATGATATTT TTGAATATAA TTCTAATGGC CTGCTGCAGA AAGCCTACAA 6780
TAAGGCTTCT GGCTGGACTG TGCAGTATTA CTATGATGGG CTTGGGCGAC GTGTCGCGAG 6840 
TAAGTCCAGC CTAGGGCAGC ACCTTCAGTT CTTTGTCGAC 6CGACC6CGA ACCCCATAAG 6900 
AGTTACTCAT TTGTACAACC ACACAAGCTC GGAGATTACA TCTCTGTATT ATGATCTCCA 6960 
AGGTCACCTT ATTGCCATGG AGTTAAGCAG TGGTGAAGAA TATTATGTAG CCTGTGATAA 7020 
TACAGGTACC CCACTAGCTG TGTTCAGCAG CCGAGGTCAG GTCATAAAGG AGATACTATA 7080 
CACACCTTAT GGCGATATCT ATCATGACAC TTACCCTGAC TTTCAGGTCA TAATTGGTTT 7140 
TCATGGAGGA CTCTATGATT TCCTTACTAA ATTAGTGCAC CTGGGGCAAA GGGATTATGA 7200 
TGTTGTTGCT GGCAGATGGA CAACGGCCTA TCATCACATA TGGAAACAGT TGAACCTCCT 7260 
TCCTAAACCA TTCAACCTCT ACTCCTTTGA AAATAACTAC CCAGTTGGCA AAATTCAAGA 7320 
TGTTGCAAAG TATACCACAG ACATCAGAAG TTGGTTGGAG CTATTOGGTT TCCAATTACA 7380 
CAATGTACTA CCTGGATTTC CCAAACCTGA ATTAGAAAAT TTAGAATTAA CTTACGAGCT 7440 
TCTACGGCTT CAGACAAAAA CTCAAGAGTG GGATCCTGGA AAGACTATCC TGGGCATTCA 7500 
GTGTGAACTC CAQAAACAGC TCAGGAATTT CATTTCCTTG GACCAACTAC CTATGACTCC 7560 
CCGATACAAT GATGGACGGT GCCTTGAAGG AGGGAAGCAA CCAAGGTTTG CTGCTGTCCC 7620 
TTCTGTTTTT GGGAAAGGTA TAAAATTTGC CATCAAGGAT GGCATAGTAA CAGCTGATAT 7680 
TATAGGAGTA GCCAATGAAG ATAGCAGGCG GCTTCCTGCC ATTCTCAATA ATGCCCATTA 7740 
CCTGGAAAAC CTACATTTTA OCATAGAGGG GAGGGACACT CACTACTTCA TTAAGCTTGG 7800 
GTCTCTCGAG GAAGACCTGG TGCTCATCGG TAACACTGGG GGGAGGCGGA TTCTGGAGAA 7860 
TGGTGTCAAT GTCACTGTGT CCCAGATGAC TTCTCTGTTG AATGGGAGGA CTAGACGGTT 7920 
TGCAGATATT CAGCTCCAGC ATGGAGCCCT GTGCTTCAAC ATCCGGTATG GGACAACTGT 7980 
CGAAGAGGAA AAGAATCACG TGTTGGAGAT TGCCAGACAG CGCGCAGTGG CCCAGGCCTG 8040 
GACTAAGGAA CAAAGAAGGC TGCAAGAGGG GGAAGAGGGG ATTAGGGCAT GGACAGAAGG 8100 
GGAAAAGCAG CAGCTTTTGA GCACTGGGCG GGTACAAGGT TACGATGGGT ATTTTGTTTT 8160 
GTCTGTTGAG CAGTATTTAG AACTTTCTGA CAGTGCCAAT AATATTCACT TTATGAGACA 8220 
GAGCGAAATA GGCAGGAG GT AAC AAAAATA TCTCTGCCTT TGCGTCACCA AAGACTGCCT 8280 
GTTTTTAAAA CATAAAATGG TTTATTGTAT TGGTTTTCTA GATCAGAACT CTGTATATGT 8340 
AAATATGGAG GAAAAACATA TCCAACTGCC TTTCAATGTG ACGGAAGATG GTATTTTAAT 8400 
ATTGTTTGTT TAAACTCTTT AAGAAATGAC AGAGATTTTT AGTTCTTGTG TGGCAGTATT 8460 
CAAAATAACA CAAGTAGAAC TCAAACAGCT AAAAACAGTT TTCAGAAAGC ACCACTTTCA 8520 
ATTTCCCGAG CCATGCATAT GTTCCAATAT CCAGAAAGAA CCCAAGGTTC TCTATCTCTA 8580 
TTGTGAGAAG CAGTTTCATC CTTAACTGTT GGCAGAACTT ACGGGCTATT TGAATAGGTG 8640 
GTGCAATAGT ATCTGAAACT TGCCTTTOGA AAGACTGCCA GCCCTTTGAC GTTTTCCAGA 8700 
TCTGTTATAG GAAACTTAAA AACAGGTGTA AAATGTCTTC AGCCACCATC TCCTAGAGTG 8760
AGGACCCAAT TGCCCTTCCT TCTTGATTAT TCCTCCTTGC TTGTTAAAGT AAATGCCATA 8820 
TTGTTGTGCT GTGTTTTGGC GTGTGGTGGC TGGGTTCTGT CTACCATGCT TCCCTGTGGG 8880 
TGTGGTAACC AGACTGTATA GCCGCTATTT GCTCGTGTGT ACATGATACC AAAGCAGCTG 8940 
GCCAGCGTGA CCTCTCTCAC ACGACCTGTT TTGACTCAAT TTTTTACTAA AAGTTGTTCA 9000 
GCTGTA1TGG TATCATGTAA ACATAGCTTT TATTAACCTG GGTAGGAATT TCTCATTTAT 9060 
ATATAGGATG TGTTTTGGTC ATAGTTTCAC ATTAGTGATT CAGTATCTAT ACACTGACCC 9120 
AATGGTTTTG TGCACATGAA CGGTAATTTA CTTAAAAGTA TQATTCTGGT ACAAAAACAA 9180 
ACAAAGGCTT TAGCAGGCAT ACGTGTCTGG GATGCCGATA CATACATTAA CTACTACTGC 9240 
AGAAATTCAT AAGAGCCAAA ACCTTAAAAA AATAGACCTG GTACTTAAGT GAAAGTACTA 9300 
AAGGGAAGAC CAGACCAAAC ATCACAGCAG TTGCTGCCAC ATTGTTTCAG CCCACTTAGA 9360 
TTTATCTTTC AAATGTACAA TTCTGTATTG AACATCTCCC AGCCATCTTC AGGAAATCGA 9420 
ATCAAGTAAA TCCTTTCCAA CCGAAAACAT TTCAACTAAC TATAGAGAGG CAGACTCATT 9480 
TTTACTAAAA TAATTTATAC AGTTAGTTAT TTTCGTTCTC CGTACTTACC CATTTATCTT 9540 
TATTTAATCG TCTCTACTGC CTAGGAAAAT AACTATTTTC CAGGACGGGT TATTTGTTCT 9600 
GCGATCATTT AAAATTTGGA GAAAGGTCAG GATTAGTGTT AATATCAGCT GCAGTTTCTC 9660 
AATCTCTAGG AATCCTGCAG TAAAACAAGC CCCTTGGTGA GCTGGAAGAT TTGTGCCCAG 9720 
TGACAAAGAG ATAGTTTGTA AAATGCTGTG TAATTGTAAG TTACCACAAA TGAAAATACA 9780 
TGACAGCACA ATGTGGCOCG TAGAAAATTC CCCTGAGCCA GCTTCTGCAC TTTCATCACC 9840 
GAATCTGAAC ATTTGCTATG TCTGAAGGCA AATTTATGAT GGAATGTTAG TTTGGATTCT 9900 
TTCCAGATGC TACCTAAATG CAGTGTGGGG TCATTGCCTT GCTTTGCGAT GACAGTTTCT 9960 
TTGAAAATAT GCAAAGTCAT AAGCTCATGT TAAGGTTTTT CAAGAGTCTG CCTCCTACTA 10020 
CACAAAGGAA AGCAAGGGAA AGGAAATGAC CCTGGCAAAC AGTAGGGAAG GGTGTATTCA 10080 
AACATTTCAT TTTCAAAACC TTCGGGTTAG AATAOCACTT ACACATGTAT TCTGAGAGAC 10140 
AGAATTCATG AGGAACTCAT CTCTCTTTAT AACTGGAAAC ACACCAGCTT GATATATTGC 10200 
TAATCCATAC TAAAATCATA TTATTGGGTT TTTTCTGAAT CAGGCCTGTA TTAATGGTAC 10260 
AGTATTTATT CAGAATGGAA TTCTAAAATT ACTAACAAAC TTGTTGAAAA TTTGAATACC 10320 
TCCACACCAA CCTAAAAATG GACCTTAAGT TCCTAGAACC TCTGATGTTC TTTTAAATTA 10380 
ATGGAAAAAT AATTTGTGAA CTGTATATAG AGAGTGCATT CATAAATGTG ATTATGTATT 10440 
TTATCACAAA TCCAAAATGT CAATATTAGA GTCTATTTTG CTTATATTTT AAGCAATTAT 10500 
ACGTTTTTGC AATTCATTGA TGATGTATCA TTTTCAAACT GCTTTAAATA TCCATTAGAA 10560 
ACAAATATTT GAAGCTTTTA CTTAATAGTG ATTACCTTGA ACTGTGCATT TCTAGTTTGT 10620 
AATACGTATT TGGTTGGTTC GTGCCTTTAG TTTGTTAAAG TTACATTTGT ATTATATTCA 10680
GGAAATGCAC TTTTTATTAC TTACAGCTGT GGTTTTAATA CTGCCTTGAA CTATTATTAT 10740 
TCTTTTTACA ACTCCTAAAG CTTGAGGGAG GAAAGAAAAA AAAAACAAAA CTACTAATCA 10800 
GTAGTAAATC GAAGAGAAAC ATTTTGGCAT TTCTTAAGAA GAAGATGGAG ATATTGAGTA 10860 
TATCACTTCC TATTCAGCTG AATAGAAAGA ATGCCTTCAT TGACTTGCAG TTCTGCAGTT 10920 
TAAATTATTG AAAGAACAAT TCGTTTGCAT TTCCTGATGA AAGTAAAAGC ATTTTTCAGA 10980 
GAAACATATG AATTTCTCAT ACCCAGCAGA CAGATGGCTG ACACTGCACA GCCACACACC 11040 
ATTCGAGTAA GTTAAAGTGA GAGCATAGTA GTTGGACTCT CCTATGAAGA ACATTCTGGG 11100 
CTGGAGGCAG GGAATACTCC ATGGTTGTTT CTTTTTCCTA CTTAAGCCCA TTTTGTTTGT 11160 
GCTTTTCTGT TTTGTTTTGT TTTCACTCTT GCACTACAGT CTAGAGATCC AAATGAACTG 11220 
AAAAGTTCAA AGTTTAACAC ATTTAAATAT GTTTACTTTT AGTTGTCATT CTAATCGTTA 11280 
TTGATTAGAA GCATGACTCC TGAAGGAAAG GGAAATAAAT CTCAATTCAT ACTAACTTGC 11340 
AACAAAACAC TTTTACCATA TAAATAAGTA TATGATTTAT TTTTAACCCA AAAAATGTAT 11400 
AAAATAAGTG TGTCCTTTAC TGTCAATTTA TCGAGAAGAT CTATAATATA TAGACTACAT 11460 
ATATATAATA TATACAACAT AGCCAAATGT ATGAAAACTT GACAATGTAT AATTTGGAAT 11520 
TCACATGCTA CCTATGTAGA CAGGTATGAA ATTAAGTTAT AATTTTCATQ AGACATTTTC 11580
ATCACTGTTG ACACAGTTTC AAGGCATTCC ATCATGTTAT TTTGACTCTT TTTCTTTTTT 11640 
TTTTCTTTAA AAATATATTT TTAACTAGAC CAGGCCOCAC TATAATATCA CTTAAGAGAG 11700 
TCAGGGCAAA GTTTTTGCAT TTATGAAGAT GTGTTCATGT AAGGGTGATT GTAATGGAGT 11760 
TCATTGGTAA TAGAAGCAAA AGTACAGTAA CGAAGTATTG AAAAGAAAAT TTTGGAGACA 11820 
TTGGAOCATA TTATATATAO CTTGTGGAAA GACATAAGGC TACAGATGGA ATGGAACATT 11880 
CCTGTTTTCT TGAAGAAATT CACATACACA TAGCTGACCT GACTAGTACT TCAGCTCTTC 11940 
CACAGCCTTC TATAAAGGTT CTTTCTTCTG CAAAGAAAAC AAAACAAAAC AAAACAAAAC 12000 
AAAAAAAAAC AAAAAAAGCG CAAAAAACAA AAAAACAAAA AAAAGCAAAG TAAAATTTAA 12060 
AAATACAGAA AACAAACAAC AAAAAAGAAT TCAACCATAA ATAGTGACTA TTATTTTCAG 12120 
TGTGTCCTTC ATGTGAAAGC TATTAAGGAC CAAATATACT ACTGTTCATA AGAAGAAATT 12180 
ACTTTCTAAA CAGTAACTGA AAATACTTAG AGTTAAACTT GCTGTGGATT TTGTCTTOGC 12240 
AGTTGTCATC TTACATTATT TGTCAAAGGA AATGTGTTTG GCAGTTAAAA ATCTTTCCTT 12300 
AGATTTAGTG GTGGACTTTA ACCTCTTAAA TAAATGTTAG TATATCAGAT TGTGTCCTTG 12360 
AAAAATATTT 1ACTTGTATG AATCATGACA ACGTCTAAAT CTTTACTATT CTTCTGGCAA 12420 
AAGCATCAGT AAGAAAGAAG GCGAAAAAGA GAAGTATAGC CTTTATGTCA GAAAAACATT 12480 
CTWTTAGCT GCTTACTTTC TCATGAAAAG TAAAGATGTT TACAGTGTAT GOCAAGTTTT 12540 
CAGTTTCTGT ATAACAACAG GTAGAGGTTC TAATCATATT GAAAATTGTG TTATAATGGT 12600 
CTGAGCCATG TTGCTAGGAA ACAATAGGTT CCAATTTTGT ATTCCTGCTC TCCTGTGCTG 12660 
AAAAGTGACT GGATACTGTA CAGGTTCATG TTCTCTGGCT GCAGTTAAAT GGTCTTTTGC 12720 
ATTTTGCTCT GGCTTTCAGG CCAGAAGCAT GCATTTTTCT ACAAGAGCAT CACAACAACA 12780 
TGCTGTAAAT ATTTAAAGTT AAACATTATG TGTTGATATT TGAAAGAAAA GTACTTTGAA 12840 
TATTTCATTT TTAAAAAATA AAATTGCCAA TGAAAAAAAA 



Protein Accession #: NP_055068



1 11 21 31 41 51 

I I I I I I

MEQTDCKFYQ PLPKVKHEMD LAYTSSSDES EDGRKPRQSY N5RETLHEYN QBLRMNYNSQ 60 

SRKRKEVEKS TQEMEFCETS HTLCSGYQTD MHSVSRHGYQ LEMGSDVDTE TEGAASPDHA 120 

LRMWIRGMKS EHSSCLSSRA NSALSLTDTD HERKSDGENG FKFSPVCCDM EAQAGSTQDV 180 

QSSPHNQFTP RPLPPPPPPP HACTCARKPP PAADSLQRRS MTTRSQPSPA APAPPTSTQD 240 

SVHLHNSWVL NSNIPLETRH SLFKHGSGSS AIFSAASQNY PLTSNTVYSP PPRPLPRSTP 300 

SRPAFTPNKP YRCCNWKCTA LSATAITVTL ALLLAYVIAV HLFGLTWQLQ PVEGELYANG 360 

VSKGNRGTES MDTTYSPIGG KVSDKSEKKV PQKGRAIDTG EVDIGAQVHQ TIPPGLPWRF 420 

QITIHHPIYL KFNISLAKDS LLGIYGRRNI PPTHTQPDFV KLMDGKQLVK QDSKGSDDTQ 480 

HSPRNLILTS LQETGPIEYM DQGPWYLAPY NDGKKMEQVF VLTTAIEIMD DCSTNCNGNG 540 

ECISGECHCF PGFLGPDCAR DSCPVLCGGN GEYEKGHCVC RHGWKGPECD VPEEQCIDPT 600 

CFGHGTCIMG VCICVPGYKG EZCEEEDCLD PHCSNHGICV KGECHCSTGW GGVNCETPLP 660 

VCQEQCSGHG TPLLDAGVCS CDPKWTGSDC STELCTMECG SHGVCSRGIC QCEEGWVGPT 720 

CEERSCHSHC TEHGQCKDGK CECSPGWEGD HCTIAHYLDA VRDGCPGLCF GNGRCTLDQN 780 

GWHCVCQVGW SGTGCNWME MLCGDNLEND GDGLTDCVDP DCCQQSNCYI SPLCQGSPDP 840 

LDLIQQSQTL FSQHTSRLFY DRZKFLIGKD STHVTPPEVS FDSRRACVIK GQWAHX3TP 900 

LVGVNVSFLH HSDYGFTISR QDGSFDLVAI GGISVZLIFD RSPFLPEKRT LWLFWNQFIV 960 

VEKVTMQRW SDPPSCDISN FISPNPIVLP SPLTSFGGSC PBRGTIVPEL GWQEEIPIP 1020 

SSFVRLSYLS SRTPGYKTLL RILLTHSTIP VGMUCVHLTV AVEGRLTQKW PPAAINLVYT 1080 

FAWNKTDIYC QKVWGLAEAL VSVGYEYETC PDFILWEQRT WLQGFEMDA SNLGDWSUJK 1140 

HHILNPQSGI XHRGNGENMF ISQQPPVIST IMGNGHQRSV ACTNCNGPAH NNKLFAPVAL 1200 

ASGPDGSVYV GDFNFVRRIF PSGNSVSILE LSTSPAHKYY LAMDPVSESL YLSDTNTRKV 1260 

YKLKSLVETK DLSKNFEWA GTGDQCLPFD QSHCGDGGRA SEASLNSPRG ITVDRHGFIY 1320 

FVDGTMIRKI DENAVITTVT GSNGLTSTQP LSCDSGMDIT QVRLEWPTDL AVNPMDNSLY 1380 

VLDNNIVLQI SENRRVRIIA GRPIHCQVPG IDHFLVSKVA IHSTLESARA ISVSHSGLLF 1440 

IAETDERKVN RIQQVTTNGE IYIIAGAPTD CDCKIDPNCD CFSGDGGYAK DAKMKAPSSL 1500 

AVSPDGTLYV ADLGNVRIRT ISRNOAHLND MNIYEIASPA DQELYQFTVN GTHLHTLNLI 1560 

TRDYVYNFTY NSEGDLGAIT SSNGNSVHIR RDAGGMPLWL WPGGQVYWL TIS5KGVLKR 1620 

VSAQGYNPAL MTYPGNTGLL ATKSNENGWT TVYEYDPEGH LTNATFPTGE VSSFHSDLEK 1680 

LTKVELDTSN RENVLMSTNL TATSTIYILK QENTQSTYRV NPDGSLRVTF ASGMEIGLSS 1740 

EPHILAGAVN PTLGKCNISL PGEHKANLIE WRQRKEQNKG NVSAFERRLR AHNRNLLSID 1800 

FDHTTRTGKI YDDHRKFTLR ILYDQTGRPZ LWSPVSRYNE VNITYSPSGL VTFIQRGTWN 1860 

EKMEYDQSQK IISRTWADGK IWSYTYLEKS VMLLLHSQRR YIFEYDQSDC LLSVTMPSMV 1920 

RHSLQTMLSV GYYRMIYTPP DSSTSFIQDY SRDGRLLQTL HLGTGRRVLY KYTKQARLSE 1980 

VLYDTTQVTL TYEESSGVIK TZHLHHDGFI CTIRYRQTGP LIGRQIFRFS EEGLVNARFD 2040 

YSYNNFRVTS HQAVZNETPL PIDLYRYVDV SGRTEQFGKF SVINYDLNQV ITTTVMKHTK 2100 

IFSANGQVIE VQYEILKAIA YWMTIQYDNV GRHGNHCIRV GVDANITRYF YEYDADGQLQ 2160 

TVSVNDKTQW RYSYDLNGDI NLLSHGKSAR LTPLRYDLRD RITRLGEIQY KHDEDGFLRQ 2220 

RGNDIFEYNS NGLLQKAYNK ASGWTVQYYY DGLGRRVASK SSLGQHLQFF VDATANPIRV 2280 

THLYNHTSSE ITSLYYDLQG HLIAKELSSG EEYYVACDNT GTPLAVFSSR GQVIKEILYT 2340 

PYGDIYHDTY PDFQVIIGFH GGLYDFLTKL VHLGQRDYUV VAGRWTTAYH HIWKQLNLLP 2400 

KPFNLYSFEN NYFVGKIQDV AKYTTDIRSW LELFGFQLHN VLPGFPKPEL ENLELTYELL 2460 

RLQTKTQEWD PGKTILGIQC ELQKQLRNFI SLDQLPWTPR YNDGRCLEGG KQPRFAAVPS 2520 

VFGKGIKFAI KDQIVTADII GVANEDSRKL AAILNNAHYL ENLHFTIEGR DTHYFIKLGS 2580 

LEEDLVLIGN TGGRRILENG VNVTVSQMTS LLNGRTRRFA DIQLCHGALC FNIRYGTTVE 2640 

EEKNHVLEIA RQRAVAQAWT XEQRRLQEGE EGIRAWTEGE KQQLLSTGRV QGYDGYFVLS 2700 
VEQYLELSDS ANNTHFMRQS EIGRR

SEQ ID NO:231 PFD4 DNA SEQUENCE

Nucleic Acid Accession #: NM_000441
WO 02/30268



Coding sequence: 225-2567 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

I I I I I I

CTCAGCCTTC COGGTTCGGG AAAGGGGAAG AATGCAGGAG GGGTAGGATT TCTTTCCTGA 60 

TAGGATCGGT TGGGAAAGAC CGCAGCCTGT GTGTGTCTTT CCCTTCGACC AAGGTGTCTG 120 

TTGCTCCGTA AATAAAACGT CCCACTGCCT TCTGAGAGCO CTATAAAGGC AGCGGAAGGG 180

TAGTCCGCGG GGCATTCCGG GCGGGGCGCG AGCAGAGACA GGTCATGGCA GCGCCAGGCG 240 

GCAGGTCGGA GCCGCCGCAG CTCCCCGAGT ACAGCTGCAG CTACATGGTG TCGCGGCCGG 300

TCTACAGCGA GCTCGCTTTC CAGCAACAGC ACGAGCGGCG CCTGCAGGAG CGCAAGACGC 360 

TGCGGGAGAG CCTGGCCAAG TGCTGCAGTT GTTCAAGAAA GAGAGCCTTT GGTGTGCTAA 420 

AGACTCTTGT GCCCATCTTG GAGTGGCTCC CCAAATACCG AGTCAAGGAA TGGCTGCTTA 480 

GTGACGTCAT TTCGGGAGTT AGTACTGGGC TAGTGGCCAC GCTGCAAGGG ATGGCATATG 540 

CCCTACTAGC TGCAGTTCCT GTCGGATATG GTCTCTACTC TGCTTTTTTC CCTATCCTGA 600 

CATACTTTAT CTTTGGAACA TCAAGACATA TCTCAGTTGG ACCTTTTCCA GTGGTGAGTT 660 

TAATGGTGGG ATCTGTTGTT CTGAGCATGG CCCCCGACGA ACACTTTCTC GTATCCAGCA 720 

GCAATGGAAC TGTATTAAAT ACTACTATGA TAGACACTGC AGCTAGAGAT ACAGCTAGAG 780 

TCCTGATTGC CAGTGCCCTG ACTCTGCTGG TTGGAATTAT ACAGTTGATA TTTGGTGGCT 840 

TGCAGATTGG ATTCATAGTG AGGTACTTGG CAGATCCTTT GGTTGGTGGC TTCACAACAG 900 

CTGCTGCCTT CCAAGTGCTG GTCTCACAGC TAAAGATTGT CCTCAATGTT TCAACCAAAA 960 

ACTACAATGG AGTTCTCTCT ATTATCTATA CGCTGGTTGA GATTTTTCAA AATATTGGTG 1020 

ATACCAATCT TGCTGATTTC ACTGCTGGAT TGCTCACCAT TGTCGTCTGT ATGGCAGTTA 1080 

AGGAATTAAA TGATCGGTTT AGACACAAAA TCCCAGTCCC TATTCCTATA GAAGTAATTG 1140 

TGACGATAAT TGCTACTGCC ATTTCATATG GAGCCAACCT GGAAAAAAAT TACAATGCTG 1200 

GCATTGTTAA ATCCATCCCA AGGGGGTTTT TGCCTCCTGA ACTTCCACCT GTGAGCTTGT 1260 

TCTCGGAGAT GCTGGCTGCA TCATTTTCCA TCGCTGTGGT GGCTTATGCT ATTGCAGTGT 1320

CAGTAGGAAA AGTATATGCC ACCAAGTATG ATTACACCAT CGATGGGAAC CAGGAATTCA 1380 

TTGCCTTTGG GATCAGCAAC ATCTTCTCAG GATTCTTCTC TTGTTTTGTG GCCACCACTG 1440

CTCTTTCCCG CACGGCCGTC CAGGAGAGCA CTGGAGGAAA GACACAGGTT GCTGGCATCA 1500 

TCTCTGCTGC GATTGTGATG ATCGCCATTC TTGCCCTGGG GAAGCTTCTG GAACCCTTGC 1560 

AGAAGTCGGT CTTGGCAGCT GTTGTAATTG CCAACCTGAA AGGGATGTTT ATGCAGCTGT 1620 

GTGACATTCC TCGTCTGTGG AGACAGAATA AGATTGATGC TGTTATCTGG GTGTTTACGT 1680 

GTATAGTGTC CATCATTCTG GGGCTGGATC TCGGTTTACT AGCTGGCCTT ATATTTGGAC 1740 

TGTTGACTGT GGTCCTGAGA GTTCAGTTTC CTTCTTGGAA TGGCCTTGGA AGCATCCCTA 1800 

GCACAGATAT CTACAAAAGT ACCAAGAATT ACAAAAACAT TGAAGAACCT CAAGGAGTGA 1860 

AGATTCTTAG ATTTTCCAGT CCTATTTTCT ATGGCAATGT CGATGGTTTT AAAAAATGTA 1920 

TCAAGTCCAC AGTTGGATTT GATGCCATTA GAGTATATAA TAAGAGGCTG AAAGCGCTGA 1980 

GGAAAATACA GAAACTAATA AAAAGTGGAC AATTAAGAGC AACAAAGAAT GGCATCATAA 2040 

GTGATGCTGT TTCAACAAAT AATGCTTTTG AGCCTGATGA GGATATTGAA GATCTGGAGG 2100 

AACTTGATAT CCCAACCAAG GAAATAGAGA TTCAAGTGGA TTGGAACTCT GAGCTTCCAG 2160 

TCAAAGTGAA CGTTCCCAAA GTGCCAATCC ATAGCCTTGT GCTTGACTGT GGAGCTATAT 2220 

CTTTCCTGGA CGTTGTTGGA GTGAGATCAC TGCGGGTGAT TGTCAAAGAA TTCCAAAGAA 2280 

TTGATGTGAA TGTGTATTTT GCATCACTTC AAGATTATGT GATAGAAAAG CTGGAGCAAT 2340 

GCGGGTTCTT TGACGAGAAC ATTAGAAAGG ACACATTCTT TTTGACGGTC CATGATGCTA 2400 

TACTCTATCT ACAGAACCAA GTGAAATCTC AAGAGGGTCA AGGTTCCATT TTAGAAACGA 2460 

TCACTCTCAT TCAGGATTGT AAAGATACCC TTGAATTAAT AGAAACAGAG CTGACGGAAG 2520 

AAGAACTTGA TGTCCAGGAT GAGGCTATGC GTACACTTGC ATCCTGAAAG TGGGTTCGGG 2580 

AGGTCTCTAT GAGCAAGGAA TACAAGACAA AACTTCCTCA ATGCATTGAC TATTTCTTCA 2640 

GACTCAAAAC ACTCATTCTT TTTTCTATTA AGCCATTGAA AGAGAAGCAC TAAGACTGCT 2700 

TCTAGGCTTT ATTTATAAAA TAAACACCTT ATCCCTAACA TGGGCAAAAT GGCTAGAATT 2760 

ATTCAGACGA TTTGGCAGCG TCCAGGGTAA GCTGGTGTTA TAATACGCTG CTGATCTACA 2820 

TCACAGATTT GCTAATAATG TTCACGTGGG CCCTGGCATA TCTCTGTTCA GTTAGAGTGA 2880 

GTGCTGACCC AACAGCCTCT GTGGTCAAGC GAGTCACGAA TGATTAATCA TAAAGAAAAA 2940 

TCAGTTTTTG ACTGACCTGG ATATCCATGA GCTGCACTGA TCACCATGTA AGGTCACATT 3000 

TAGTAAATGC TGAAATAAAA TGATTAATGC ATTTATCAAT AAAAGCCTTT GAAAATACTT 3060 

TGGATAATAA ATTGGAGTTT TAAAAATGCA AATTTGCTTA GTATCTAATA ATGAAGTGTT 3120 

ATTACATATA GCCGGAATTG AGGATCTCTT TGATCCTGGA AATGGTTTAC CTAAAAGCTA 3180 

CAGAACCAGG CCAATATATT TTGAAATATT GATGCAGACA AATGAAATAA TAAAGAGATT 3240 

TTCATGGTTT ATAAAAATCT TTTTTGATAT GATAATAATC ATGATCACAA CTGAGATCAA 3300 

AAAAATATAT GACAGATTAT TTTGTTTAAA AATGCAGTTT TAATTATCTT AGTCTATAGA 3360 

AATGATCATT GCATGGAGGC ATGTATAGGT ATGATCTGTG TAAAATCTGA CATAAAAACA 3420 

GTGCTATTCT GAGTGAAAAT TTTTTTGATG TGCTTACATA ACCATGGTGA TTAAAATGAG 3480 

TTTATATTTT TTCTCAAAAA TTTTAGCAGT GTGTAAAGTA AGTAATCTTT AACTGAACTC 3540 

TGACCACTTA AAAAAAAATC TAAAAATTGA ACTACCTATA GTAGTCTGTG TTTAAAGTGA 3600 

ATTTTTAAAG ACAAAGCATT CTAAATGAAC TCAATATAAA AACATTCATT TGGAATGTAC 3660 

ATACTGAAAA ATACAGGTTT TTTTGACCAA AAGTTTTTAT ATCTTTTCTT TTTATTTATT 3720 

TTTTTCCTAA GTGCCAACAA TTTTCTAGAT ATTATATACA ACACAGGCTT TGATCTTGGG 3780 

GACTTTTCCC ATATATTTCA CACTGGAGTG AATGAAGTTG TACTTCATTT CTAGAGAAAA 3840 

GTTATACCCA GGTCCCCAAT TGAGAATGTC TTGCTTGATT GAAAACGACA TCATCCCTTG 3900 

GTATACTCCA GGGATTGGTT TCAGGACCCC TGCATTTACC AAAATTTGTG CACACTCAAG 3960 

TCCTGCAGTC ACCCCTGCCT AAAGATAGAA TGGCTTCTCT GTTTTTCTTC TGAAATACAA 4020 

CCAGAAACAA TGTGTCTATT TCTGAAAGAA TAGGATTAAT GATCATACAA ATGGGTTAAT 4080 

CCTGAATTCT GGTTGTAAAT CTGGTTACAG CATAACTAGG ATTATAATGC TGCCTCATTT 4140 

TCACAGCACT ACTTGCTTAT ATTGACAACA AATCATCTCG CTAAAGAGTG AATGTAGGCC 4200 

AGGCGCGGTG GCTCATGCCT GTAATCCCAG CACTTTGGGA GGCCGAGGCG GGTGGATCAC 4260 

GAGGTCAGGA GATCGAGACC ATCCTGGCTA ACATGGTAAA ACCCCGTCTC TACTAAAAAT 4320 

AGAAAAAAAG AAATTAGCCT AGCGTGGTGG CTGGCGGGCG CCTGTAGTCC CAGCTATTTG 4380 

GGAGGCTAAG GCAGGAGAAT GGCGTGAACC CGGGAGGCGG AGCTTGCAGT GAGCCGAGGT 4440 

CGTGCCACTG CACTOCAGCC TGGGCGACAG AGCAAGACTC CGTCTCAAAA AAAAAAAAAA 4500 
AAAAAAAAAA AGAGTGAATG TAATAGTCTT GCAGAAAATG AATGAATACC TTTGTTCAAT 4560

AAAGGAAATA TGCACTGCTC ACTTTTTTGA AGGAAATGOC AAAGTTACGT TTTACAACAA 4620 

GGCTAGAGTT TGTAAATTCT GGGTTCATTT GTGATGACAT AAGTCAGCAA ACTGCGGGAA 4680

TACTGTCTCT TCTATGTATT TTGTGAATAG TAAGCATAAT TTTAGTTTTG TATTATCAAT 4740 

GAAAATTTCA CTTGAAATTA AAGCTGCCTT TTGTTATATT TTTAACCTAT AGGATAAGAT 4800 

TCCAGTATTG TATATGAGTT TTAACAAATT AAAAAATCAA ATCATGTACA TTTGAAAATA 4860 

TTTGCACACA TTTAAAAATA AATGTAAAGT TGTCTTTTAA ACTACTCGGA TGTGTCCTTT 4920 
CTGAACAAAA 



SEQ ID NO:232 PFP4 Protein sequence:
Protein Accession #: O43511



1 11 21 31 41 51 

I I I I I I 

MAAPGGRSEP PQLPEYSCSY MVSRPVYSEL AFQOQHERKL QERKTLRESL AXCCSCSRKR 60 

AFGVLKTLVP ILBWLPKYRV KEWLLSDVIS GVSTGLVATL QGHAYALIAA VFVGYGLYSA 120 

PPPILTYPIP GTSRHISVGP FPWSLMVGS WLSMAPDEH FLVSSSNGTV LNTTWIDTAA 180 

RDTARVLIAS ALTLLVGIIQ LIFGGLQIGF IVRYLADPLV GGFTTAAAFQ VLVSQLKIVL 240 

KVSTKNYNGV LSIIYTLVEX FQNIGDTNLA DFTAGLLTIV VCMAVKELND RPRHKIFVPI 300 

PIEVrVTIIA TAISYGANLE KNYNAGIVKS IPRGFLPPEL PPVSLFSEML AASFSIAWA 360 

YAIAVSVGKV YATKYDYTID GNQEFIAPGI SNIFSGPPSC FVATTALSRT AVQESTGGKT 420 

QVAGIISAAI VMIAILALGK LLEPLQKSVL AAWIANLKG MFMQLCDIPR LWRQNKIDAV 480 

IWVPTCIVSI ILGLDLGLLA GLIFGLLTW LRVQFPSWNG LGSIPSTDIY KSTKNYKNIE 540 

EPQGVKILRP SSPIFYGHVD GFKKCIKSTV GFDAIRVYNK RLKALRKIQK LIKSGQLRAT 600 

KNGIISOAVS TONAFEFDED IEDLEELDIP TKEIEIQVDW NSELPVKVNV PKVPZHSLVL 660 

DCGAISFLDV VGVRSLRV2V KEFQRIDVNV YFASLQDYVI EKLEQCGFFD DNIHKDTFPL 720 

TVHDAILYLQ NQVKSQEGQG SILETITLIQ DCKDTLELIE TELTEEELDV QDEAMRTLAS 780

SEQ ID NO:233 PFH2 DNA SEQUENCE:

Nucleic Acid Accession #: NM_016029

Coding sequence: 228-1097 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 
I I I I I I

CTGCGATCCC GCAGGGCAGC GACGCGACTC TGGTGCGGGC CGTCTTCTTC CCCCCGAGCT 60 

GGGCGTGCGC GGCCGCAATG AACTGGGAGC TGCTGCTGTG GCTGCTGGTO CTGTGCGCGC 120 

TGCTCCTGCT CTTGGTGCAG CTGCTGCGCT TCCTGAGGGC TGACGGCGAC CTGACGCTAC 180 

TATGGGCCGA GTGGCAGGGA CGACGCCCAG AATGGGAGCT GACTGATATG GTGGTGTGGG 240 

TGACTGGAGC CTCGAGTGGA ATTGGTGAGG AGCTGGCTTA CCAGTTGTCT AAACTAGGAG 300 

TTTCTCTTGT GCTGTCAGCC AGAAGAGTGC ATGAGCTGGA AAGGGTGAAA AGAAGATGCC 360 

TAGAGAATGG CAATTTAAAA GAAAAAGATA TACTTGTTTT GCCCCTTGAC CTGACCGACA 420 

CTCGTTCCCA TGAAGCGGCT ACCAAAGCTG TTCTCCAGGA GTTTGGTAGA ATCGACATTC 480 

TGGTCAACAA TGGTGGAATG TCCCAGCGTT CTCTGTGCAT GGATACCAGC TTGGATGTCT 540 

ACAGAAAGCT AATAGAGCTT AACTACTTAG GGACGGTGTC CTTGACAAAA TGTGTTCTGC 600 

CTCACATGAT CGAGAGGAAG CAAGGAAAGA TTGTTACTGT GAATAGCATC CTGGGTATCA 660 

TATCTGTACC TCTTTCCATT GGATACTGTG CTAGCAAGCA TGCTCTCCGG GGTTTTTTTA 720 

ATGGCCTTCG AACAGAACTT GCCACATACC CAGGTATAAT AGTTTCTAAC ATTTGCCCAG 780 

GACCTGTGCA ATCAAATATT GTGGAGAATT CCCTAGCTGG AGAAGTCACA AAGACTATAG 840 

GCAATAATCG AGACCAGTCC CACAAGATGA CAACCAGTCG TTGTGTGCGG CTGATGTTAA 900 

TCAGCATGGC CAATGATTTG AAAGAAGTTT GGATCTCAGA ACAACCTTTC TTGTTAGTAA 960 

CATATTTGTG GCAATACATG CCAACCTGGG CCTGGTGGAT AACCAACAAG ATGGGGAAGA 1020 

AAAGGATTGA GAACTTTAAG AGTGGTGTGG ATGCAGACTC TTCTTATTTT AAAATCTTTA 1080 

AGACAAAACA TGACTGAAAA GAGCACCTGT ACTTTTCAAG CCACTGGAGG GAGAAATGGA 1140 

AAACATGAAA ACAGCAATCT TCTTATGCTT CTGAATAATC AAAGACTAAT TTGTGATTTT 1200 

ACTTTTTAAT AGATATGACT TTGCTTCCAA CATGGAATGA AATAAAAAAT AAATAATAAA 1260 
AGATTGCCAT GAATCTTGCA AA 



SEQ ID NO:234 PFH2 Protein sequence:

Protein Accession #: NP_057113



1 11 21 31 41 51

I I I I I I

MNWELLLWLL VLCALLLLLV QLLRPLRADG DLTLLWAEWQ GRRPEWELTD MVVWVTGASS 60

GIGEELAYQL SKLGVSLVLS AHRVHELERV KRRCLEKGNL KEKDILVLPL DLTDTGSHEA 120

ATKAVLQEPG RIDILVNNGG HSQRSLCMDT SLDVYRKLIE LNYLGTVSLT KCVLPHMIER 180

KQGKIVTVNS ILGIISVPLS IGYCASKHAL RGFFNGLRTE LATYPGIIVS NICPGFVQSN 240

IVENSLAGEV TKTIGNNGDQ SHKMTTSRCV RLMLISMAND LKEVWISEQP FLLVTYLWQY 300

MPTWAWWITN KHGKKRIENF RSGVDADSSY FKIFKTKHD


SEQ ID NO:235 AX5 DNA SEQUENCE

Nucleic Acid Accession #: NM_000450

Coding sequence: 1-1833 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I

ATGATTGCTT CACAGTTTCT CTCAGCTCTC ACTTTGGTGC TTCTCATTAA AGAGAGTGGA 60 

GCCTGGTCTT ACAACACCTC CACGGAAGCT ATGACTTATG ATGAGGCCAG TGCTTATTGT 120 

CAGCAAAGGT ACACACACCT GGTTGCAATT CAAAACAAAG AAGAGATTGA GTACCTAAAC 180 

TCCATATTGA GCTATTCACC AAGTTATTAC TGGATTGGAA TCAGAAAAGT CAACAATGTG 240

TGGGTCTGGG TAGGAACCCA GAAACCTCTG ACAGAAGAAG CCAAGAACTG GGCTCCAGGT 300

GAACCCAACA ATAGGCAAAA AGATGAGGAC TGCGTGGAGA TCTACATCAA GAGAGAAAAA 360 

GATGTGGGCA TGTGGAATGA TGAGAGGTGC AGCAAGAAGA AGCTTGCCCT ATGCTACACA 420 

GCTGCCTGTA CCAATACATC CTGCAGTGGC CACGGTGAAT GTGTAGAGAC CATCAATAAT 480 

TACACTTGCA AGTGTGACCC TGGCTTCAGT GGACTCAAGT GTGAGCAAAT TGTGAACTGT 540 

ACAGCCCTGG AATCCCCTGA GCATGGAAGC CTGGTTTGCA GTCACCCACT GGGAAACTTC 600 

AGCTACAATT CTTCCTGCTC TATCAGCTGT GATAGGGGTT ACCTGCCAAG CAGCATGGAG 660 

ACCATGCAGT GTATGTCCTC TGGAGAATGG AGTGCTCCTA TTCCAGCCTG CAATGTGGTT 720 

GAGTGTGATG CTGTGACAAA TCCAGCCAAT GGGTTCGTGG AATGTTTCCA AAACCCTGGA 780 

AGCTTCCCAT GGAACACAAC CTGTACATTT GACTGTGAAG AAGGATTTGA ACTAATGGGA 840 

GCCCAGAGCC TTCAGTGTAC CTCATCTGGG AATTCGGACA ACGAGAAGCC AACGTGTAAA 900

GCTGTGACAT GCAGGGCCGT CCGCCAGCCT CAGAATGGCT CTGTGAGGTG CAGCCATTCC 960 

CCTGCTGGAG AGTTCACCTT CAAATCATCC TGCAACTTCA CCTGTGAGGA AGGCTTCATG 1020 

TTGCAGGGAC CAGCCCAGGT TGAATGCACC ACTCAAGGGC AGTGGACACA GCAAATCCCA 1080 

GTTTGTGAAG CTTTCCAGTG CACAGCCTTG TCCAACCCCG AGCGAGGCTA CATGAATTGT 1140 

CTTCCTAGTG CTTCTGGCAG TTTCGGTTAT GGGTCCAGCT GTGAGTTCTC CTGTGAGCAG 1200 

GGTTTTGTGT TGAAGGGATC CAAAAGGCTC CAATGTGGCC CCACAGGGGA GTGGGACAAC 1260 

GAGAAGCCCA CATGTGAAGC TGTGAGATGC GATGCTGTCC ACCAGCCCCC GAAGGGTTTG 1320 

GTGAGGTGTG CTCATTCCCC TATTGGAGAA TTCACCTACA AGTCCTCTTG TGCCTTCAGC 1380 

TGTGAGGAGG GATTTGAATT ATATGGATCA ACTCAACTTG AGTGCACATC TCAGGGACAA 1440 

TGGACAGAAG AGGTTCCTTC CTGCCAAGTG GTAAAATGTT CAAGCCTGGC AGTTCCGGGA 1500 

AAGATCAACA TGAGCTGCAG TGGGGAGCCC GTGTTTGGCA CTGTGTGCAA GTTCGCCTGT 1560 

CCTGAAGGAT GGACGCTCAA TGGCTCTGCA GCTCGGACAT GTGGAGCCAC AGGACACTGG 1620 

TCTGGCCTGC TACCTACCTG TGAAGCTCCC ACTGAGTCCA ACATTCCCTT GGTAGCTGGA 1680 

CTTTCTGCTG CTGGACTCTC CCTCCTGACA TTAGCACCAT TTCTCCTCTG GCTTCGGAAA 1740 

TGCTTACGGA AAGCAAAGAA ATTTGTTCCT GCCAGCAGCT GCCAAAGCCT TGAATCAGAC 1800 
GGAAGCTACC AAAAGCCTTC TTACATCCTT TAA 



SEQ ID NO:236 AX5 Protein sequence:
Protein Accession #: NP_000441



1 11 21 31 41 51 

I I I I I I 

MIASQFLSAL TLVLLIKESG AHSYNTSTEA MTYDEASAYC QQRYTHLVAI QNKEEIEYLN 60

SILSYSPSYY WIGIRKVNNV WVWVGTQKPL TEEAKNWAPG EPNNRQKDED CVEIYIKREK 120 

DVGMWNDERC SKKKLALCYT AACTNTSCSG HGECVETINH YTCKCDPGFS GLKCEQIVNC 180

TALESPEHGS LVCSHPLGNF SYNSSCSISC DRGYLPSSME TMQCMSSGEW SAPIPACNVV 240

ECDAVTNPAN GFVECFQNPG SFPWNTTCTF DCEEGFELMG AQSLQCTSSG NWDNEKPTCK 300 

AVTCRAVRQP QNGSVRCSHS PAGEPTFKSS CNFTCEEGFK LQGPAQVECT TQGQWTQQIP 360 

VCKAFQCTAL SNPERGYMNC LPSASGSFRY GSSCEFSCEQ GFVLKGSKRL QCGPTGEWDN 420 

EKPTCEAVRC DAVHQPPKGL VRCAHSPIGE PTYKSSCAFS CEEGPELYGS TQLECTSQGQ 480 

WTEEVPSCQV VKCSSLAVPG KINMSCSGEP VFGTVCKFAC PEGWTLNGSA ARTCGATGHW 540 

SGLLPTCKAP TESNIPLVAG LSAAGLSLLT LAPFLLWLRK CLRKAKKFVP ASSCQSLESD 600
GSYQKPSYIL 

SEQ ID NO:237 PM28 DNA SEQUENCE

Nucleic Acid Accession #: N510Q2

Coding sequence: 1-3793 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

ATGATGTGTG AAGTGATGCC CACGATTAAT GAGGACACCC CAATGAGCCA AAGGGGGTCC 60 

CAAAGCAGTG GCTCGGACTC AGACTCCCAT TTTGAGCAGC TGATGGTGAA TATGCTAGAT 120 

GAAAGGGATC GTCTTCTAGA CACCCTTCGG GAGACCCAGG AAAGCCTCTC ACTTGCCCAG 180 

CAAAGACTTC AGGATGTCAT CTATGACCGA GACTCACTCC AGAGACAGCT CAATTCAGCC 240

CTGCCACAGG ATATCGAATC CCTAACAGGA GGGCTGGCTG GTTCTAAGGG GGCTGATCCA 300

CCGGAATTTG CTGCACTGAC AAAAGAATTA AATGCCTGCA GGGAACAACT TCTAGAAAAG 360 

GAAGAAGAAA TCTCTGAACT TAAAGCTGAA AGAAACAACA CAAGACTATT ACTGGAGCAT 420 

TTGGAGTGCC TTGTGTCACG ACATGAAAGA TCACTAAGAA TGACGGTGGT AAAACGGCAA 480 

GCCCAGTCTC CCTCAGGAGT ATCCAGTGAA GTTGAAGTTC TCAAGGCACT GAAATCTTTG 540 

TTTGAGCACC ACAAGGCCTT GGATGAAAAG GTAAGGGAGC GACTGAGGGT TTCTTTAGAA 600 

AGAGTCTCTG CACTGGAAGA AGAACTAGCT GCTGCTAATC AGGAGATTGT TGCCTTGCGT 660 

GAACAAAATG TTCATATACA AAGAAAAATG GCATCAAGCG AGGGATCCAC AGAGTCAGAA 720

CATCTTGAAG GGATGGAACC TGGACAGAAA GTCCATGAGA AGCGTTTGTC CAATGGTTCT 780 

ATAGACTCAA CCGATGAAAC TAGTCAAATA GTTGAACTAC AAGAATTGCT TGAAAAGCAA 840 

AACTATGAAA TGGCCCAGAT GAAAGAACGT TTAGCAGCCC TTTCTTCCCG AGTGGGAGAG 900 

GTGGAACAGG AAGCAGAGAC AGCAAGAAAG GATCTCATTA AAACAGAAGA AATGAACACC 960 

AAGTATCAAA GGGACATTAG GGAGGCCATG GCACAAAAGG AAGATATGGA AGAAAGAATT 1020 

ACAACCCTTG AAAAGCGTTA CCTCAGTGCT CAGAGAGAAT CTACCTCCAT ACATGACATG 1080 
AATGATAAAC TAGAAAATGA GTTAGCAAAT AAAGAAGCTA TCCTACGGCA GATGGAAGAG 1140

AAAAACAGAC AGTTACAAGA ACGTCTTGAG CTAGCTGAAC AAAAGTTGCA GCAGACCATG 1200 

AGAAAGGCTG AAACCTTGCC TGAAGTAGAG GCTGAACTGG CTCAGAGAAT TGCAGCCCTA 1260 

ACCAAGGCTG AAGAGAGACA TGGAAATATT GAAGAACGTA TGAGACATTT AGAGGGTCAA 1320

CTTGAAGAGA AGAATCAAGA ACTTCAAAGA GCTAGGCAAA GAGAGAAAAT GAATGAGGAG 1380

CATAACAAGA GATTATCGGA TACGGTTGAT AGACTTCTGA CTGAATCCAA TGAACGCCTA 1440 

CAACTACACT TAAAGGAAAG AATGGCTGCT CTAGAAGAAA AGAATGTTTT AATTCAAGAA 1500 

TCAGAAACTT TCAGAAAGAA TCTTGAAGAA TCTTTACATG ATAAGGAAAG ATTAGCAGAA 1560 

GAAATTGAAA AGCTGAGATC TGAACTTGAC CAATTGAAAA TGAGAACTGG CTCTTTAATT 1620 

GAACCCACAA TACCAAGAAC TCATCTAGAC ACCTCAGCTG AGTTGCGGTA CTCAGTGGGA 1680

TCCCTAGTGG ACAGCCAGTC TGATTACAGA ACAACTAAAG TAATAAGAAG ACCAAGGAGA 1740

GGCCGCATGG GTGTGCGAAG AGATGAGCCA AAGGTGAAAT CTCTTGGGGA TCACGAGTGG 1800 

AATAGAACTC AACAGATTGG AGTACTAAGC AGCCACCCTT TTGAAAGTGA CACTGAAATG 1860

TCTGATATTG ATGATGATGA CAGAGAAACA ATTTTTAGCT CAATGGATCT TCTCTCTCCA 1920 

AGTGGTCATT CCGATGCCCA GACGCTAGCC ATGATGCTTC AGGAACAATT GGATGCCATC 1980

AACAAAGAAA TCAGGCTAAT TCAGGAAGAA AAAGAATCTA CAGAGTTGCG TGCTGAAGAA 2040 

ATTGAAAATA GAGTGGCTAG TGTGAGCCTC GAAGGCCTGA ATTTGGCAAG GGTCCACCCA 2100 

GGTACCTCCA TTACTGCCTC TGTTACAGCT TCATCGCTGG CCAGTTCATC TCCCCCCAGT 2160 

GGACACTCAA CTCCAAAGCT CACCCCTCGA AGCCCTGCCA GGGAAATGGA TCGGATGGGA 2220

GTCATGACAC TGCCAAGTGA TCTGAGGAAA CATCGGAGAA AGATTGCAGT TGTGGAAGAA 2280

GATGGTCGAG AGGACAAAGC AACAATTAAA TGTGAAACTT CTCCTCCTCC TACCCCTAGA 2340 

GCCCTCAGAA TGACTCACAC TCTCCCTTCT TCCTACCACA ATGATGCTCG AAGTAGTTTA 2400 

TCTGTCTCTC TTGAGCCAGA AAGCCTCGGG CTTGGTAGTG CCAACAGCAG CCAAGACTCT 2460 

CTTCACAAAG CCCCCAAGAA GAAAGGAATC AAGTCTTCAA TAGGACGTTT GTTTGGTAAA 2520 

AAAGAAAAAG CTCGACTTGG GCAGCTCCGA GGCTTTATGG AGACTGAAGC TGCAGCTCAG 2580

GAGTCCCTGG GGTTAGGCAA ACTCGGAACT CAAGCTGAGA AGGATCGAAG ACTAAAGAAA 2640 

AAGCATGAAC TTCTTGAAGA AGCTCGGAGA AAGGGATTAC CTTTTGCCCA GTGGGATGGG 2700 

CCAACTGTGG TCGCATGGCT AGAGCTTTGG TTGGGAATGC CTGCGTGGTA CGTGGCAGCC 2760 

TGCCGAGCCA ACGTGAAGAG TGGTGCCATC ATGTCTGCTT TATCTGACAC TGAGATCCAG 2820

AGAGAAATTG GAATCAGCAA TCCACTGCAT CGCTTAAAAC TTCGATTAGC AATCCAGGAG 2880

ATGGTTTCCC TAACAAGTCC TTCAGCTCCT CCAACATCTC GAACTCCTTC AGGCAACGTT 2940 

TGGGTGACTC ATGAAGAAAT GGAAAATCTT GCAGCTCCAG CAAAAACGAA AGAATCTGAG 3000 

GAAGGAAGCT GGGCCCAGTG TCCGGTTTTT CTACAGAOOC TGGCTTATGG AGATATGAAT 3060 

CATGAGTGGA TTGGAAATGA ATGGCTTCCC AGCTPGGGGT TACCTCAGTA CAGAAGTTAC 3120 

TTTATGGAAT GCTTGGTAGA TGCAAGAATG TTAGATCACC TAACAAAAAA AGATCTCCGT 3180

GTCCATTTAA AAATGGTGGA TAGTTTCCAT CGAACAAGTT TACAATATGG AATTATGTGC 3240 

TTAAAGAGGT TGAATTATGA CAGAAAAGAA CTAGAAAGAA GACGGGAAGC AAGCCAACAT 3300 

GAAATAAAAG ACGTGTTGGT GTGGAGCAAT GACCGAATTA TTCGCTGGAT ACAAGCAATT 3360 

GGACTTCGAG AATATGCAAA TAATATACTT GAGAGCGGTG TGCATGGCTC ACTTATAGCC 3420 

CTGGATGAAA ACTTTGACTA CAGCAGCTTA ACTTTATTAT TACAGATTCC AACACAGAAC 3480

ACCCAGGCAA GGCAGATTCT TGAAAGAGAA TACAATAACC TCTTGGCCCT GGGAACTGAA 3540 

AGGCGACTGG ATGAAAGTGA TGACAAGAAC TTCAGACGTG GATCAACCTG GAGAAGGCAG 3600 

TTTCCTCCTC GTGAAGTACA TGGAATCAGC ATGATGCCTG GGTCCTCAGA AACATTACCA 3660

GCTGGATTTA GGTTAACCAC AACCTCTGGG CAATCAAGAA AAATGACAAC AGATGTTGCT 3720

TCATCAAGAC TGCAGAGGTT AGACAACTCC ACTGTTCGCA CATACTCATG TCTCGAGTAA 3780
GOGGCCGCTT TAA 

SEQ ID NO:238 PM28 Protein sequence
Protein Accession #: none found

1 11 21 31 41 51 
MMCEVMPTIN EDTPMSQRGS QSSGSDSDSH FEQLMVNMLD ERDRLLDTLR ETQESLSLAQ 60

QRLQDVXYDR DSIiQRQLHSA LPQDIESLTG GLAGSKGAOP PEFAALTKEL NACREQLLEK 120 

BEEZSELKAE RNNTRLLLEH LECLVSRHER SL.RMTWKRQ AQSPSGVSSE VEVLKALKSL 180 

PEHHKALDBK VRERLRVSLE RVSALEEELA AANQEIVALR EQNVHIQRKM ASSEGSTESE 240 

HLEGMEPGQK VHEKRLSKGS IDSTOETSOI VELQELLEKQ NYEKAQMKER LAALSSRVGE 300 

VEQEAETARK DLIKTEEMNT KYQRDIREAM AQKEDMEERI TTLEKRYLSA QRESTSIHDM 360

NDKLENELAN KEAILRQMEE KNRQLQERLE LAEQKLQQTM RKAETLPEVE AELAQRIAAB 420 

TKAEERHGNI EERHRHLEGQ LEZKNQELQR ARQREKMNEE HNKRLSDTVD RLLTESHERL 480 

QLHLKERMAA LEEKNVLIQE SETFRKNLEE SLHDKERLAE BZEKLRSELD QLKHRTGSLI 540 

EPTIPRTOLD TSAELRVSVG SLVDSQSDYR TTKVXRRPRR GRMGVRRDEP KVKSLGDHEW 600 

NRTQQIGVLS SHPFESDTEM SDIDDDDRET IFSSMDLLSP SGHSDAQTLA MHLQEQLDAI 660

NKEZRLZQEE KESTBLRAEE IENRVASVSL EGLNLARVHP GTSITASVTA SSLASSSPPS 720 

GHSTPKLTPR SPAREMDRHG VMTLPSDLRK HRRK1AWEE DGREDKATXK CETSPPPTPR 780 

ALRHTHTLPS SYHNDARSSL SVSLEPBSLG LGSANSSQDS LBKAPKKKGZ KSSIGRLFGK 840 

KEKARLGQLR GFMETEAAAQ ESLGLGRLGT OAEKDRRLKK KHELLEEARR KGLPFAQWDG 900 

PTWAWLELW LGMPAWYVAA CRANVKSGAI MSALSDTEIQ REIGISNPLH RLKLRLAIQE 960

MVSLTSPSAP PTSRTPSGNV WVTHEEMENL AAPAKTKESE EGSWAQCFVF LQTLAYGDMN 1020 

HEWIGNEWLP StGLPQYRSY FMECLVDARM LDHLTKKDLR VHLKMVDSFH RTSLQYGIKC 1080 

LKRLNYDRKE LERRREASQH EIKDVLVWSN DRIIRWIOAI GLREVANNIL ESGVHGSLIA 1140 

LDENFDYSSL TLLLQIPTQN TQARQILERE TONLLALGTE RRLDESDDKN FRRGSTWRRQ 1200 
FPPREVHGIS MMPGSSETLP AGFRLTTTSG QSRKMTTDVA SSRLQRLDNS TVRTYSCLE
ATGAGGCGAC TGAATCGGAA AAAAACTTTA AGTTTGGTAA AAGAGTTGGA TGCCTTTCCG 60

AAGGTTCCTG AGAGCTATGT AGAGACTTCA GCCAGTGGAG GTACAGTTTC TCTAATAGCA 120

TTTACAACTA TGGCTTTATT AACCATAATG GAATTCTCAG TATATCAAGA TACATGGATG 180 

AAGTATGAAT ACGAAGTAGA CAAGGATTTT TCTAGCAAAT TAAGAATTAA TATAGATATT 240 

ACTGTTGCCA TGAAGTGTCA ATATGTTGGA GCGGATGTAT TGGATTTAGC AGAAACAATG 300 

GTTGCATCTG CAGATGGTTT AGTTTATGAA CCAACAGTAT TTGATCTTTC ACCACAGCAG 360

AAAGAGTGGC AGAGGATGCT GCAGCTGATT CAGAGTAGGC TACAAGAAGA GCATTCACTT 420

CAAGATGTGA TATTTAAAAG TGCTTTTAAA AGTACATCAA CAGCTCTTCC ACCAAGAGAA 480 

GATGATTCAT CACAGTCTCC AAATGCATGC AGAATTCATG GCCATCTATA TGTCAATAAA 540 

GTAGCAGGGA ATTTTCACAT AACAGTGGGC AAGGCAATTC CACATCCTCG TGGTCATGCA 600

CATTTGGCAG CACTTGTCAA CCATGAATCT TACAATTTTT CTCATAGAAT AGATCATTTG 660 

TCTTTTGGAG AGCTTGTTCC AGCAATTATT AATCCTTTAG ATGGAACTGA AAAAATTGCT 720

ATAGATCACA ACCAGATGTT CCAATATTTT ATTACAGTTG TGCCAACAAA ACTACATACA 780 

TATAAAATAT CAGCAGACAC CCATCAGTTT TCTGTGACAG AAAGGGAACG TATCATTAAC 840 

CATGCTGCAG GCAGCCATGG AGTCTCTGGG ATATTTATGA AATATGATCT CAGTTCTCTT 900 

ATGGTGACAG TTACTGAGGA GCACATGCCA TTCTGGCAGT TTTTTGTAAG ACTCTGTGGT 960 

ATTGTTGGAG GAATCTTTTC AACAACAGGC ATGTTACATG GAATTGGAAA ATTTATAGTT 1020

GAAATAATTT GCTGTCGTTT CAGACTTGGA TCCTATAAAC CTGTCAATTC TGTTCCTTTT 1080 
GAGGATGGCC ACACAGACAA CCACTTACCT CTTTTAGAAA ATAATACACA TTGA 

SEQ ID NO:240 PCM Protein sequence:

Protein Accession #: NP_057654

1 11 21 31 41 51 
MRRLNRKKTL SLVKELDAFP KVPESYVETS ASGGTVSLIA FTTMALLTIM EFSVYQDTWM 60

KYEYEVDKDF SSKLRINIDI TVAMKCQYVG ADVLDLAETM VASADGLVYE PTVFDLSPQQ 120

KEWQRMLQLI QSKLQEEHSL QDVIFKSAFK STSTALPPRE DDSSQSPNAC RIHGHLYVNK 180 

VAGNFHITVG KAIPHPRGHA HLAALVNHES YNFSHRIDHL SFGELVPAII NPLDGTBKIA 240 

IDHNQMFQYF ITWPTKLHT VKISADTHQF SVTERERIIN HAAGSHGVSG IFMKYDLSSL 300 

MVTVTEEHMP FWQFFVRLCG IVGGIFSTTG MLHGIGKFIV EIICCRFRLG SYKPVNSVPF 360



SEQ ID NO:241 PBA7 DNA SEQUENCE

Nucleic Acid Accession #: AA219134

Coding sequence: 24-1815 (underlined sequences correspond to start and stop codons)

AATTCGCCCT TGCTTAATTA AGCATQTTTA CCTTCCTCTC ATCTGTCACT GCTGCTGTCA 60 

GTGGCCTCCT GGTGGGTTAT GAACTTGGGA TCATCTCTGG GGCTCTTCTT CAGATCAAAA 120 
CCTTATTAGC CCTGAGCTGC CATGAGCAGG AAATGGTTGT GAGCTCCCTC GTCATTGGAG 180

OCCTOCTTGC CTCACTCACC GOAGGGGTCC TGATAGACAG ATATGGAAGA AGGACAGCAA 240

TCATCTTGTC ATOCTGCCTG CTTGGACTCG QAAGCTTAGT CTTGATCCTC AGTTTATOCT 300

ACAOGGTTCT TATAGTGGGA CGCATTGCCA TAGGGGTTTC CATCICCCTC TCTTOCATTG 360
_ CCACrili HUT TTACATOGCA GAGATTOCTC CTCAACACAG AAGAGGCCTT CTTGTGTCAC 420 
TGAATGAGCT GATGATTGTC ATCGGCATTC TTTCTGCCTA TATTTCAAAT TACGCATTTG 480

CCAATOTTTT OCATGGCTGG AAGTACATGT TTGGTCTTGT GATTCCCTTG GGAOTTTT0C 540

AAGCAATTGC AATGTATTTT CTTCCTCCAA GCCCTCGGTT TCTGGTGATG AAAGGACAAG 600 

AGGGAGCTGC TAGCAAGGTT CTTGGAAGGT TAAGAGCACT CTCAGATACA ACTGAGGAAC 660

TCACTGTCAT CAAATOCTCC CTGAAAGATG AATATCAGTA CAGTITTTGO GA1C1U1T1C 720
GTTCAAAAGA CAACATGCGG ACOOGAATAA TGATAGGACT AACACTAGTA TnTCTGTAC 780

AAATCACTGG CCAACCAAAC ATATTGTTCT ATGCATCAAC TGTTTTGAAG TCAGTTGGAT 840 

TTCAAAGCAA TGAGGCAGCT AGCCTCGCCT GCACTGGGGT TGGAGTCGTC AAGGTCATTA 900 

GCACCATCCC TGCCACTCTT CTTGTAGACC ATGTCGGCAG CAAAACATTC CTCTGCATTG 960 

GCTCCTCTGT GATGGCAGCT TCGTTGGTGA CCATGGGCAT CGTAAATCTC AACATCCACA 1020 
TGAACTTCAC OCATATCTGC AGAAGCCACA ATTCTATCAA CCAGTCCTTG GATGAGTCTG 1080

TGATTTATGG ACCAGGAAAC CTGTCAACCA ACAACAATAC TCTCAGAGAC CACTTCAAAG 1140 

GGATTTCTTC GCATAGCAGA AGCTCACTCA TGCCCCTGAG AAATGATGTG GATAAGAGAG 1200 

GGGAGACGAC CTCAGCATCC TTGCTAAATG CTGGATTAAG CCACACTGAA TACCAGATAG 1260 

TCACAGACCC TGGGGACGTC CX^AGCTTTTT TGAAATGGCT GTOCTTAGCC AGCTTGCTTG 1320 
TTTATGTTGC TGCTTTTTCA ATTGGTCTAG GACCAATGCC CTGGCTGGTG CTCAGCGAGA 1380

TCTTTOCTGG TGGGATCAGA GGACGAGCCA TGGCTTTAAC TTCTAGCATG AACTGGGGCA 1440

TCAATCTCCT CATCTCGCTG ACAl'l 111GA CTGTAACTGA TCTTATTGGC CTGCCATGGG 1500 

TGTGCTTTAT ATATACAATC ATGAGTCTAG ATCTTATTGG CCTGCCATGG GTGTGCTTTA 1560 

TATATACAAT CATGAGTCTA GCATCCCTGC TTTTTGTTGT TATGTTTATA CCTGAGACAA 1620 
AGGGATGCTC TTTGGAACAA ATATCAATGG AGCTAGCAAA AGTGAACTAT GTGAAAAACA 1680

ACA l ' l 101 11 TATQ AGTCAT CACCAAGAAG AATTAGTGCC AAAACAGCCT CAAAAAAGAA 1740 

AACCCCAGGA GCAGCTCTTG GAGTGTAACA AGCTGTGTGG TAGGGGCCAA TCCAGGCAGC 1800 

TTTCTCCAGA GACCTAATGG CCTCAACACC TTCTGAACGT GGATAGTGCC AGAACACTTA 1860 

GGAGGGTGTC TTTGG ACXZAA TGCATAGTTG CGACTCCTGT GCTCTCTTTT CAGTGTCATG 1920 
GAACTGGTTT TGAAGAGACA CTCTGAAATG ATAAAGACAG CCTTTAATCC CCCTCCTCMC 1980

CAGAAGGAAC CTCAAAAGGT AGATG AGGTA CAAGGTCCTA AGTGATCTCT TTTTCTGAGC 2040 

AGGATATCAG GTTAAAAAAA AAAAGTTACT GGCTGGnTA ATACTTTCTA CCTTCTTCAC 2100 

AG AGCAGCCT TTGAATAG AC TATGTCCTAG TGAAG ACATC AACCTCCGCC TTAAGCTATG 2160 

TATGTATGGA GGCCAGTCGC AGCTTTATTA TGCAGACACA CAAGTGGTCT GGACATGAGG 2220 
GTACAGTTTC TGCCTACCAA GACACTACTT GCACTGGATC TTACGCAAAA AAGAAOCAGA 2280
ACACACAGTG TGGACAACTG CCCATATATT CTATCTAGAT TAGGAGAQGG TCCTGGCTAG 2340
GATTTTAGTG GTAATTCCTA GTTACATTCA ACAAGTATAA AGATTATAGA GCTTATTTTA 2400 
TGAACTATAA ACTATAATTT AATGCAAAAT ATCCTTTTAT GAATTTCATG TTAATATTGT 2460 
GAAATATTAA AATAATTCCR CAATAGTTGA GAAAAATGAG CATTTTTTTC CAIIIUAAA 2520 
AAATGCATAG AAAAGACAAT TTTAAAATOC TGGGACCATA TTTATTTAGA AGTAGCTGTT 2580
AGTAAAACAT TAGAAAAGGA GTCAGGCCAT TAGGTTATTT ATCCAAATCT CTAAGCAATT 2640 
AGGTTGAAGT TATTAAGTCA AGCCTAGAAA AGCTGCCTCC TTGTAAGGCT TTCATGACAA 2700 
TGTATAGTAA TOCACAGTGT CCAATTCTTC ACACTCCTCA GGAATATCAC TACCTCAGGT 2760 
TACGGTACAC AGGCTATAAT TGATGATGAT GTTCAGATAA CTGAAGACAC AATAAATGAC 2820 
ATFCAGACAT CAGGAMAAWW CCCTCATGTT CTTTTCTATG ATGGCCACCT GTACCAGCAA 2880 
CGTGGGTTTC ACCCACACAA CGATGAACTG TTCTCTTACT TCTCCAGTTG ATTTTAAAGA 2940 
CTTGTTAAGA GGTCTTACTA ATAAAATTTG GGTATGATAG AAAAWCCACA ATCAAAWCTT 3000 
GAACCAAATA ACATATTAAA TTACTAATAT TTAAGTGATG GAAGACACAC AAAAAACTTA 3060
AAAGCACGAA CAACCTAACT TGAAAAAGAA TTTTAAAATA TGATTAACCT GAAGAAAAGA 3120
GAATCCTAAG AGCCAAAGCT CCTTTTTATT TAGCTTGGAA TTTTCCTATT GGTTCCTAAC 3180
AAACTGTCCC AATGTCATAT AAGGAAACAT GATCTATTAC ATTCCTTTAT AACAATGTGG 3240
AGAGACTATA AACCTATGTA AGTAGTAAAA CTATATYAGA GACTCAGGAG ACTGACTAAA 3300 
AGGCCTGGAT CTGCAGTGTA TTATCTGTAT AAAAATTGGC AGGGGGAAGC TAAAAGGAAA 3360 
GGAGATTGGA GATCTCAATT CTATCATGGT GTATTTCATA CGCAAATCAG AGCATGCATT 3420 
GTrTTTTGTT TTTGGAAAGA GAAGGGAAGT GTGTTCTGCC CCATGTTTCC 7TCCGTGTCT 3480 
ATAGTTCAAA CTCTATATAT ACTTCAGGTA TTTTTTGTTT AGCCCTTCAT TATAAATGGG 3540 
CAGGAAATTG TTTATCAACC TAGCCAGTTT ATTACTAGTG AOCTTGACTT CAGTATCTTG 3600
AGCATTC1TT TATATTTTTC TTTTATTATC CTGAGTCTGT AACTAAACAA TTTTGTCTTC 3660 
AAATTTTTAT CCAATATCCA TTGCACCACA CCAAATCAAG CTTCTTGATT TTCAAAAATA 3720 
AAAAGGGGGA AATACTTACA ACTTGTACAT ATATATTCAC AGTTTTTATT TATAAAAAAA 3780
ATTTACAGTA CTTATGGAGA GCCAGCAGAA GACATCAGAG CACTCACTIC TTCCCATCTT 3840
TGTTAAGGTT AGCGAATTAC CCATGGACAC TGTTAGGTGA GGCTCATTCG GCAOCCCTGA 3900 
AAACAAACCT GGTCACACTG TCTTTACCCT CTCCCTTCAG ATAAAGCACT TCGATTATCT 3960 
ATTGATCTGC CCAGTTTTCA AGTCATGCGA ATACTAAAAA GGTTACATCA TCTGGATCTG 4020 
TACCTTGGCT ATATAAGCAT GTTTTCCCCC TATTCTATGT TTCTTTTTTT GGTOAACATT 4080
GAAAAACAGG AGGTGACTTA TTACTGTTAA TTAAAACTAA ATGAAAAATG TCAAGTCTTT 4140 
AAAACAGTGA GCTTGTAACT CTTTCATGTA ATTTTATTCT CTATGAATTT GGCTATDCTA 4200 
CTGAATCTTA AAATAAAGGA AATAAACACT TTTTTTTWAA AAAAAAGGAA AAATAMAARW 4260 
MWAAAAATCT CAATGAAATA TTTCACAAGA AGGAAAAA 



Protein Accession #: AAB1431

MFTFLSSVTA AVSGLLVGYE LGIISGALLQ IKTLLALSCH EQEMWSSLV IGALLASLTG 60 
GVIJDRYGRR TAIILSSCLL GLGSLVULS LSVTVLIVGR IAIGVSBLS SIATCVYIAE 120 
IAPQHRRGLL VSLNELMT/I GILS AY1SNY AFANVFHGWK YMFGLVIPLG VLQAIAMYFL 180 
PPSPRFLVMK GQEGAASKVL GRLRALSDTT EELTVIKSSL KDEYQYSFWD LFRSKDNMRT 240 
RIMIGLTLVF FVQITGQPNI LFYASTVLKS VGFQSNEAAS LASTGVGWK YBTIPATLL 300 
VDHVGSKTFL QGSSVMAAS LVTMGIVNLN IHMNFTHICR SHNSINQSLD ESVIYGPGNL 360 
STNNNTLRDH FKGISSHSRS SLMPLRNDVD KRGETTSASL LNAGLSHTEY QIVTDPGD VP 420 
AFLKWLSLAS LLVYVAAFSI GLGPMPWLVL SEIFPGGIRG RAMALTSSMN WGINLUSLT 480 
FLTVTDLIGL PWVCHYTIM SLDLIGLPWV CHYTTMSLA SLLFWMHP ETKGCSLEQI 540 
SMBLAKVNYV KNNICFMSHH QEELVPKQPQ KRKPQEQLLE CNKLCGRGQS RQLSPET 



SEQ ID NO:243 PAB4 DNA sequence:
Nucleic Acid Accession #: AA17Z056



TTTAGCCACC AGAGGANTTC TCTTGAAATA CCCAAAATCC ATCAGTATCT TGAATCATGC 60
TGGATTTTGA AGAATTCTTA AGAAGCCATG TAAAGGGGGC TCTCTGGOCT TGAAATAGTG 120
ATGTTTTTTA TACAGAAAGG AGAATGCAGA ATGGTCAGAC TATCATGCAC TGTTAAATTT 180
GATTTCAAGA AATTACAGGA AAACTTTCCA AAGTTCCATC TCACAGAANN TTATTTTNCC 240
AAGAATTCCA AGATAAGTTT AGTTTTATGG AAGACTTTTA TGTGGTTTTT ACTCACTCTT 300
CATCTCAGAC ATCGACAGAT GATTACATCA CTTAIAGTTC TAGTAAATTT ATTAATATAA 360
AACTCAGAGA CATTCCAATA TCCACATTGC TTACAOCATT AGGCATAGAT TCAGTGTCAG 420
CTATGACAAT TGAAAATGAG CTGTTTTGTG ATTTAAAGGT TTAAATTTCT CTAACCAAAC 480 
TGCTTGATCC AGATGCAGGA CTGCAAATGT TAATAT1TGT TCTGGAAGAA CAATCAAATA 540 
AGACTTAAGA GGAAAGGGAA TGGCCACAAT CCACCTGAAA TTTTTTCTTA AAAAGTGTGC 600 
AGCCTACTAA ATCAGAATGA AAATAGAAGT ACAAGATTAT AAACAAAATG CAATCAAACT 660 
TTTCTTAAGC TTACCTAAAG TTATTTCATC TGAAAATTTC AAGCAACTTT GTTCAACATT 720 
AAATTGACAA TCTAAACTAA CAAGTCTTTT GAATTTATGC ATGGTAGTAA ACATTCTCTC 780 
TATTAACTTT ATTACCTAAG GCTAAACCTA AAATTTTTAA GCAAAATTAG AAAAATAGTC 840 
TTCACTCATC AAAAAATAAA GTTTGTTACA TTTAGTATTT TCCCAATAAA ATTGGTCGTT 900
CTTGGTTTTT TATTTGGAGA GTCTGTGCAA AATGTCACTA AAAATAAATT AGCACTAGAA 960
ATTATTTCTA AATACCAAA 
MATGGCGTG CCCGTCTCTC CGCCGGCCCC CTGCCTCGCA GTGGTTTCTC CTGCAGCTCC 60
OCTGGGCTCC GCGGCCAGTA GTGCAGCCCG TGGAGCCGCG 6CTTTGCCCG TCTCCTCTGQ 120 
GttSGOCCCAG TGCGCGGGCT GACACTCATT CAGCCGGGGA AGGTGAGGCG AGTAGAGGCT 180 
GGTGCGGAAC TTGCCGCCCC CAGCAGCGCC GGCGGGCTAA GCCCAGGGCC GGGCAGACAA 240 
AAGAGQCCGC CCGCOTAGGA AGGCACGCCC GGCGGCGGCG GAGCGCAGCG ATGGCCGGGC 300 
GA6GGG6CAG CGCGCTGCTG GCTCTGTGCG GGGCACTGGC TGCCTGCGGG TGGCTCCT66 360 
GCGCCGAAGC CCAGGAGCCC GGGGCGCCCG CGGCGGGCAT GAGGCGGOOC CGGCOGCTGC 420 
AGCAAGAGGA OGGCATCTCC TTCGAGTACC ACCGCTACCC CGAGCTGCGC GAGGCGCTCG 480
TGTCCGTGTG GCTGCAGTGC ACCGCCATCA GCAGGATTTA CACGGTGGGG CGCAGCTTCG 540 
AGGGCCGGGA GCTOCTGGTC ATCGAGCTGT CCGACAACCC TGGCGTCCAT GAGCCTGGTG 600 
AGCCTGAATT TAAATACATT GGGAATATGC ATGGGAATGA GGCTGTTGGA CGAGAACTGC 660
TCATTTTCTT GGCCCAGTAC CTATGCAACG AATACCAGAA GGGGAACGAG ACAATTGTCA 720 
ACCTGATCCA CAGTACCCGC ATTCACATCA TGCCTTCCCT GAACCCAGAT GGCTTTGAGA 780 
AGGCAGCGTC TCAGCCTGGT GAACTCAAGG ACTGGTTTGT GGGTCGAAGC AATGCCCAGG 840
GAATAGATCT GAACCGGAAC TTTCCAGACC TGGATAGGAT AGTGTACGTG AATGAGAAAG 900 
AAGGTGGTCC AAATAATCAT CTGTTGAAAA ATATGAAGAA AATTGTGGAT CAAAACACAA 960 

AGCTTGCTCC TGAGACCAAG GCTGTCATTC ATTGGATTAT GGATATTOCT TTTGTGCTTT 1020 

CTGCCAATCT CCATGGAGGA GACCTTGTGG CCAATTATCC ATATGATGAG ACGCGGAGTG 1080 

GTAGTGCTCA CGAATACAGC TCCTOCCCAG ATGACGCCAT TTTCCAAAGC TTGGOCOGGG 1140 

CATACTCTTC TTTCAACCCG GCCATGTCTG ACCCCAATCG GCCACCATGT CGCAAGAATG 1200 

ATGATGACAG CAGCTTTGTA GATGGAACCA CCAACGGTGG TGCTTGGTAC AGCGTACCTG 1260 

GAGGGATGCA AGACTTCAAT TACCTTAGCA GCAACTGTTT TGAGATCACC GTGGAGCTTA 1320 

GCTGTGAGAA GTTCCCACCT GAAGAGACTC TGAAGACCTA CTGGGAGGAT AACAAAAACT 1380 

CCCTCATTAG CTACCTTGAG CAGATACACC GAGGAGTTAA AGGATTTGTC CGAGACCTTC 1440 

AAGGTAACCC AATTGCGAAT GCCACCATCT CCGTGGAAGG AATAGACCAC GATGTTACAT 1500 

CCGCAAAGGA TGGTGATTAC TGGAGATTGC TTATACCTGG AAACTATAAA CTTACAGCCT 1560 

CAGCTCCAGG CTATCTGGCA ATAACAAAGA AAGTGGCAGT TCCTTACAGC CCTGCTGCTG 1620

GGGTTGATTT TGAACTGGAG TCATTTTCTG AAAGGAAAGA AGAGGAGAAG GAAGAATTGA 1680 

TGGAATCGTG GAAAATGATG TCAGAAACTT TAAATTTTTA AAAAGQCTTC TAGTTAGCTG 1740

CTTTAAATCT ATCTATATAA TGTAGTATGA TGTAATGTGG TCTTTTTTTT AGATTTTGTG 1800

CAGTTAATAC TTAACATTGA TTTATTTTTT AATCATTTAA ATATTAATCA ACTTTCCTTA 1860 

AAATAAATAG OCTCTTAGGT AAAAATATAA GAACTTGATA TATTTCATTC TCTTATATAG 1920 

TATTCATTTT CCTACCTATA TTACACAAAA AAGTATAGAA AAGATTTAAG TAATTTTGCC 1980 

ATCCTAGGCT TAAATGCAAT ATTCCTGGTA TTATTTACAA TGCAGAATTT TTTGAGTAAT 2040 

TCTAGCTTTC AAAAATTAGT GAAGTTCTTT TACTGTAATT GGTGACAATG TCACATAATG 2100 

AATGCTATTG AAAAGGTTAA CAGATACAGC TCGGAGTTGT GAGCACTCTA CTGCAAGACT 2160 

TAAATAGTTC AGTATAAATT GTCGTTTTTT TCTTGTGCTG ACTAACTATA AGCATGATCT 2220 

TGTTAATGCA TTTTTGATGG GAAGAAAAGG TACATGTTTA CAAAGAGGTT TTATGAAAAG 2280 

AATAAAAATT GACTTCTTGC TTGTACATAT AGGAGCAATA CTATTATATT ATGTAGTCCG 2340 

TTAACACTAC TTAAAAGTTT AGGGTTTTCT CTTGGTTGTA GAGTGGCCCA GAATTGCATT 2400 
CTGAATGAAT AAAGGTTAAA AAAAAATCCC CAGTGAAAAA AAA 

Protein Accession P16870 

MAGRGGSALL ALCGALAACG WLLGAEAQEP GAPAAGMRRR RRLQQEDGIS FEYHRYPELR 60 

EALVSVWLQC TAISRIYTVG RSFEGRELLV IELSDNPGVH EPGEPEFKYI GNMHGNEAVG 120 
REUUFLAQY LCNEYQKGNE TTVNUHSTR IHIMPSLNPD GFEKAASQPG ELKDWFVGRS 180 
NAQGIDLNRN FPDLDRIVYV NEKBGGPNNH LLKNMKKIVD QNTKLAPETK AVIHWIMDIP 240 

FVLSANLHGG DLV ANYPYDE TRS GSAHEYS SSPDDAIFQS LARAYSSFNP AMSDPNRPPC 300 
RKNDDDSSFV DGTTNGGAWY SVPGGMQDFN YLSSNCFHT VELSCEKH7EBTLKTYWED 360 

NKNSLBY1E QIHRGVKGFV RDLQGNPIAN ATTS VE GIDH DVTSAKDGDY WRLUPGNYK 420 
LTASAPGYLA ITKKVAVPYS PAAGVDFELE SFSERKEEEK EELMEWWKMM SETLNF 



Nucleic Acid Accession #: AF038966

Coding sequence: 91-1107 (underlined sequence corresponds to start and stop codon)

1 11 21 31 41 51 

I I I I I I 

GGGGCGACGT GAGCGCGCAG GGGGGCGGCG GCCTCGCCTC GTCTCTCTCT CTGCGCCTGG 60 

GTCGGGTGGG TGACGCCGAG AGCCAGAGAG ATGTCGGATT TCGACAGTAA CCCGTTTGCC 120 

GACCCGGATC TCAACAATCC CTTCAAGGAT CCATCAGTTA CACAAGTGAC AAGAAATGTT 180 

CCACCAGGAC TTGATGAATA TAATCCATTC TCGGATTCTA GAACACCTCC ACCAGGCGGT 240 

GTGAAGATGC CTAATGTACC CAATACACAA CCAGCAATAA TGAAACCAAC AGAGGAACAT 300 

CCAGCTTATA CACAGATTGC AAAGGAACAT GCATTGGCCC AAGCTGAACT TCTTAAGCGC 360 

CAAGAAGAAC TAGAAAGAAA AGOCGCAGAA TTAGATCGTC GGGAACGAGA AATGCAAAAC 420 

CTCAGTCAAC ATGGTAGAAA AAATATTTGG CCACCTCTTC CTAGCAATTT TCCTGTCGGA 480 

CCTTGTTTCT ATCAGGAATT TTCTGTAGAC ATTCCTGTAG AATTCCAAAA GACAGTAAAG 540 

CTTATGTACT ACTTGTGGAT GTTCCATGCA GTAACACTGT TTCTAAATAT CTTCGGATGC 600 

TTGGCTTGGT TTTGTGTTGA TTCTGCAAGA GCGGTTGATT TTGGATTGAG TATCCTGTGG 660 

TTCTTGCTTT TTACTCCTTG TTCATTTGTC TGTTGGTACA GACCACTTTA TGGAGCTTTC 720 

AGGAGTGACA GTTCATTTAG ATTCTTTGTA TTCTTCTTOG TCTATATTTG TCAGTTTGCT 780

GTACATGTAC TCCAAGCTGC AGGATTTCAT AACTGGGGCA ATTGTGGTTG GATTTCATCC 840 

CTTACTGGTC TCAACCAAAA TATTCCTGTT GGAATCATGA TGATAATCAT AGCAGCACTT 900 

TTCACAGCAT CAGCAGTCAT CTCACTAGTT ATGTTCAAAA AAGTACATGG ACTATATCGC 960 

ACAACAGGTG CTAGTTTTGA GAAGGCCCAA CAGGAGTTTG CAACAGGTGT GATGTCCAAC 1020 

AAAACTGTCC AGACCGCAGC TGCAAATGCA GCTTCAACTG CAGCATCTAG TGCAGCTCAG 1080 
AATGCTTTCA AGGGTAACCA GATTTAAGAA TCTTCAAACA ATACACTGTT ACCTTTTGAC 1140

TGTACCTTTT TCTCCAGTTA CTGTATTCTA CAAATATTTT TATGTTCAAA ACACACAGTA 1200 

CAGACAGCAT GGATATTTCC TGTTCACTTG TGCATGGGCT AAAACCAGGA AAACTTCCTT 1260 

OTCTTATTAC TTTACCTAAT AGTTTCTTAA TATTTCAGTG CCCCTTGCAG AAAAAATATT 1320 

ACATGCTAAA TAAATATTCT CCATATTTTT GGGGGATGAC ATTCAGTGAA TTATTTCAGT 1380

GGTGACCCAC TGAAAATTAA TAATGGTACT TATGATTAAA AACGCATTTA ATACTAACTG 1440 

CAGTAGTTCT TTCAAGAATC TTTAGAGATA AGGATTGCAC ATTGGAAAAG TAAACCATGT 1500 

TTCATTCCTT TTTCCCTATT TATATTGAAA GAAATAGGCC AGCAGAGACT TAGGGATTTT 1560 

AAATTGGCTT GCTTTTTAGC TGTTTCAGTC ACCAGTGAAG AGCCTATGTG CATTTTGTAG 1620 

TAGATAATGT AAAATTTGTC ATCTTTTTCT TTTCTTTTTT TTAGAATAGC TGATATTTTG 1680 

ATAACAATCT CTAATTTGCA TGGGCACCAC ATTTCTTATA TTAAAAGAAT TAGTGTTTTG 1740 

GCTTCTGTAC TGCTTATGGT TGTAGGATTC AGGGGTTAAT GGAATCACAG AAATGATATT 1800 

CTGCAAGAAT TTCTTTTAAA TAAAAAGTTT GGGGGTGCAA TATAAGAAGT 1TATATAATA 1860 

TGCAGTACAT TATCCAAAAG AGAAGGTAGT TAATGCAGTA GAAAGTAGTG GTAATAATTC 1920 
CTTTTT 



SEQ ID NO:247 PBY4
Protein Accession #: 

MSDFDSNPFA DPDLNNPFKD PSVTQVTRNV PPGLDEYNPF SDSRTPPPGG VKMPNVPNTQ 60 
PAIMKPTEEH PAYTQIAKEH ALAQAELLKR QEELERKAAE LDRREREMQN LSQHGRKNIW 120 
PPLPSNFPVG PCFYQEFSVD IPVEFQKTVK LMYYLWMFHA VTLFLNIFGC LAWFCVDSAR 180 
AVDFGLSELW FLLFTPCSFV CWYRPLYGAF RSDSSFRFFV FFFVYICQFA VHVLQAAGFH 240 
NWGNCGWISS LTGLNQNIFV GIMMMAAL FTASAVISLV MFKKVHGLYR TTGASFEKAQ 300 
QEFATGVMSN KTVQTAAANA ASTAASSAAQ NAFKGNQI 



Nucleic Acid Accession #: none found

Coding sequence: 1-613 (underlined sequence corresponds to start and stop codon) 



ATOAGAQACA ATAAATCGTG TGCTTTTTTC ATGGGAAAGT TAAATGTTTG TTTTGAAGGC 60 
ACAGTAATAG CAGGCTATTC AGTGTTTGCC ACTACCTGCA TCATTCATCT GGCTGTAGCT 120 
AGTGCACTAC AATTTCCTAA AAAGTCTTCT CACCCTCACA GGACTGCTCT ACATCTGGCC 180 
TCTGCCAATG GAAATTCAGA AGTAGTAAAA CTCCTGCTGG ACAGACGATG TCAACTTAAT 240
ATCCTTGACA ACAAAAAGAG GACAGCTCTG ACAAAGGCCG TACAATGCCA GGAAGATGAA 300 
TGTGCGTTAA TGTTGCTGGA ACATGGCACT GATCCGAATA TTCCAGATGA GTATGGAAAT 360 
ACCGCTCTAC ACTATGCTAT CTACAATGAA GATAAATTAA TGGCCAAAGC ACTGCTCTTA 420 
TACGGTGCTG ATATCGAATC AAAAAACAAG CATGGCCTCA CACCACTGTT ACTTGGTGTA 480 
CATGAGCAAA AACAGCAAGT GGTGAAATTT TTAATCAAGA AAAAAGCAAA TTTAAATGCA 540 
CTGGATAGAT ATGGAAGGTG TGTGACCTTG GGAACGTTAT TTACCACCAA ATATGTTGTC 600 
ATATATGAAA AGTAG 



SEQ ID NO:249 ppHg Protein sequence:
Protein Accession #: none found

MRDNKSCAFF MGKLNVCFEG TVIAGYSVFA TTCIIHLAVA SALQFPKKSS HPHRTALHLA 60 
SANGNSEWK LLLDRRCQLN HDNKKRTAL TKAVQCQEDE CALMLLEHGT DPNIPDEYGN 120 
TALHYAIYNE DKLMAKALLL YG ADIESKNK HGLTPLLLG V HEQKQQWKF UKKKANLNA 180 
LDRYGRCVTL GTLFTTKYW IYEK 



SEQ ID NO:250 PBJ1 DNA sequence:
Nucleic Acid Accession #: XM_005829

Coding sequence: 1-3043 (underlined sequence corresponds to start and stop codon)

ATGGTGATCA TCTATCTTTC TTTCTGCAAT TATTACATGG AGTTCTACAG AGAAGAGCTT 60 
CCCCACATTG ACTATTTGAT TGACATTCAG TTTGCAACAG GAAAGGTTAC TCAGCCGGGA 120 
GAGGACACTT CCTACCATCA ATGCGCTCAG CTTGAAGCCA GAGACGAAGG CACCGACAGT 180 
TTATTATTAA ACAATGGCAG CAGCGCCACG CTGAAGACAC GAACGCGCTG TTATGGAACC 240 
CCCAGAGGTC TCCCCCATCG TAGCCTGCTC CAGCCGACTC CGCCCACATG TAAAACGAAG 300 
ATCAGGAGCA GATTTGAAGA ATTACAAAGT GAATTGGTGC CAGTCAGCAT GTCAGAGACA 360
GACCACATAG CCTCTACTTC CTCTGATAAA AATGTTGGGA AAACACCTGA ATTAAAGGAA 420
GACTCATGCA ACTTGTTTTC TGGCAATGAA AGCAGCAAAT TAGAAAATGA GTCCAAACTA 480
TTGTCATTAA ACACTGATAA AACTTTATGT CAACCTAATG AGCATAATAA TCGAATTGAA 540
GCCCAGGAAA ATTATATTCC AGATCATGGT GGAGGTGAGG ATTCTTGTGC CAAAACAGAC 600
ACAGGCTCAG AAAATTCTGA ACAAATAGCT AATTTTCCTA GTGGAAATTT TGCTAAACAT 660
ATTTCAAAAA CAAATGAAAC AGAACAGAAA GTAACACAAA TATTGGTGGA ATTAAGGTCA 720 
TCTACATTTC CAGAATCAGC TAATGAAAAG ACTTATTCAG AAAGCCCCTA TGATACAGAC 780 
TGCACCAAGA AATTTATTTC AAAAATAAAG AGCGTTTCAG CATCAGAGGA TTTGTTGGAA 840 
GAAATAGAAT CTGAGCTCTT ATCTACGGAG TTTGCAGAAC ATCGAGTACC AAATGGAATG 900 
AATAAGGGAG AACATGCATT AGTTCTGTTT GAAAAGTGTG TGCAAGATAA ATATTTGCAG 960 
CAGGAACATA TCATAAAAAA GTTAATTAAA GAAAATAAGA AGCATCAGGA GCTCTTCGTA 1020 
GACATTTGTT CAGAAAAAGA CAATTTAAGA GAAGAACTAA AGAAAAGAAC AGAAACTGAG 1080 
AAGCAGCATA TGAACACAAT TAAACAGTTA GAATCAAGAA TAGAAGAACT TAATAAAGAA 1140
GTTAAAGCTT CCAGAGATCA ACTAATAGCT CAAGACGTTA CAGCTAAAAA TGCAGTTCAG 1200 
CAGTTACACA AAGAGATGGC CCAACGGATG GAACAGGCCA ACAAGAAATQ TOAAGAOGCA 1260
OGCCAAGAAA AAGAAGCAAT GGTAATGAAA TATGTAAGAG GTGAGAAGGA ATCTTTAGAT 1320
CTTCGAAAGG AAAAAGAGAC ACTFGAGAAA AAACTTAGAG ATGCAAATAA GGAACTTGAG 1380
AAAAACACTA ACAAAATTAA GCAGCTTTCT CAGGAGAAAG GACGGTTGCA CCAGCTGTAT 1440 
GAAACTAAGG AAGGCGAAAC GACTAGACTC ATCAGAGAAA TAGACAAATT AAAGGAAGAC 1500
ATTAACTCTC ACGTCATCAA AGTAAAGTGG GCACAAAACA AATTAAAAGC TGAAATGGAT 1560
TCACACAAGG AAACCAAAGA TAAACTCAAA GAAACAACAA CAAAATTAAC ACAAGCAAAG 1620
GAAGAAGCAG ATCAGATACG AAAAAACTGT CAGGATATGA TAAAAACATA TCAGGAGTCA 1680
GAAGAAATTA AATCAAATGA GCTTGATGCA AAGCTTAGAG TCACAAAAGG AGAACTTGAA 1740
AAACAAATGC AAGAAAAATC TGACCAGCTA GAGATGCATC ATGCCAAAAT AAAGGAACTA 1800
GAAGATCTGA AGAGAACATT TAAGGAGGGT ATGGATGAGT TAAGAACACT GAGAACAAAG 1860
GTGAAATGTC TAGAAGATGA ACGATTAAGA ACAGAAGATG AATTATCAAA ATATAAGGAA 1920
ATTATTAATC GCCAAAAAGC TGAAATTCAG AATTTATTGG ACAAGGTGAA AACTGCAGAT 1980
CAGCTACAGG AGCAGCTTCA AAGAGGTAAG CAAGAAATTG AAAATTTGAA AGAAGAAGTG 2040
GAAAGTCTTA ATTCTTTGAT TAATGACCTA CAAAAAGACA TOGAAGGCAG TAGGAAAAGA 2100 
GAATCTGAGC TGCTGCTGTT TACAGAAAGG CTCACTAGTA AGAATGCACA GCTTCAGTCT 2160 
GAATCCAATT CTTTGCAGTC ACAATTTGAT AAAGTTTCCT GTAGTGAAAG TCAGTTACAA 2220
AGCCAGTGTG AACAAATGAA ACAGACAAAT ATTAATTTGG AAAGTAGGTT GTTGAAAGAG 2280
GAAGAACTGC GAAAAGAGGA AGTOCAAACT CTGCAAGCTG AACTCGCTTG TAGACAAACA 2340
GAAGTTAAAG CATTGAGTAC GCAGGTAGAA GAATTAAAAG ATGAGTTAGT AACTCAGAGA 2400
CGTAAACATG CCTCTAGTAT CAAGGATCTC AOCAAACAAC TTCAGCAAGC ACGAAGAAAA 2460 
TTAGATCAGG TTGAGAGTGG AAGCTATGAC AAAGAAGTCA GCAGCATGGG AAGTCGTTCT 2520 
AGTTCATCAG GGTCCCTGAA TGCTCGAAGC AGTGCAGAAG ATOGATCTCC AGAAAATACT 2580 
GGGTCCTCAG TAGCTGTGGA TAACTTTCCA CAAGTAGATA AGGCCATGTT GATTGAGAGA 2640
ATAGTTAGGC TGCAAAAAGC ACATGCCGGG AAAAATGAAA AGATAGAATT TATGGAGGAC 2700 
CACATCAAAC AACTGGTGGA AGAAATTAGG AAAAAAACAA AAATAATTCA AAGTTATATT 2760 
TTACGAGAAG AATCAGGCAC ACTTTCTTCA GAGGCATCTG ATTTTAACAA AGTTCATTTA 2820 
AGTAGAGGGG GTGGCATCAT GGCATCTTTA TATACATOOC ATCCAGCTGA CAATGGATTA 2880 
ACATTGGAGC TCTCTTTGGA AATCAACCGA AAATTACAGG CTGTTTTGGA GGATACGTTA 2940 
CTAAAAAATA TTACTTTGAA GGAAAATCTA CAAACACTTG GAACAGAAAT AGAACGTCTT 3000
ATTAAACAOC AGCATGAACT AGAACAGAGG ACAAAGAAAA CCTAAAACAA GCCTCTTGCT 3060 
CAGTAAAGAG ACAAAAGCCA CACAGGAGTA GGTGCCACTG ACCTCTATTG TTGGAGACTT 3120 
TGTTOCA Ci l 1T1G1 ITCAG CCAGTAAAAA T ATTOTtl 10 CTTCATCTGT ACACAAAAAA 3180 
ATACCCTTTT ACAATATGAA TGCATTGCTG TATATACTGT AAGACTGAAA GCTTTGATGA 3240 
AATTTGTTTT TGTATGGTGC AATATGACAG CCTGTCATTG AATCTAAACA ACTTAATTTG 3300
CTTGTATTCA TAAGAAGTGT TGAACATTAC AAGGGCTTTT AT



Protein Accession #: NP_060487

MVHYLSFCN YYMEFYREEL PHID YLIDIQ FATGKVTQPG EDTS YHQCAQ LEARDEGTDS 60 
LLLNNGSS AT LKTRTRCYGT PRGLPHRSLL QPTPPTCKTK KSRFEELQS ELVPVSMSET 120 
DH1ASTSSDK NVGKTPELKE DSCNLFSGNE SSKLENESKL LSLNTDKTLC QFNEHNNRIE 180 
AQENY IPDHG GGEDSCAKTD TGSENSEQIA NFPSGNFAl Oi jSKT NETEQK VTQ1LVELRS 240 
STFPESANEK TYSESPYDTD CTKKF1SKIK SVSASEDLLE BESELLSTB FAEHRVPNGM 300
NKGEHALVLF EKCVQDKYUQ QEHUKKLIK ENKKHQELFV DICSEKDNLR EELKKRTETE 360 
KQHMNTIKQL ESRIEELNKE VKASRDQUA QDVTAKNAVQ QLHKEMAQRM EQANKKCEEA 420 
RQEK EAMVMK YVRGEKESLD LRKEKETLEK KLRPANKELB KNTNKBCQ LS QEK GRLHQLY 480 
ETKEGETTRL RHDKLKED INSHVKVKW AQNKLKAEMD SHKETKDKLK ETTTKLTQAK 540 
EEADQIRKNC QDMDCTYQES EEDCSNELDA KLRVTKGELE KQMQEKSDQL EMHHAKDKEL 600 
EDLKRTFKEG MDELRTLRTK VKCLEDERLR TEDELSKYKE HNRQKAHQ NLLDKVKTAD 660 
QLQEQLQRGK QEIENLKEEV ESLNSUNDL QKDIEGSRKR ESELLLFTER LTSKNAQLQS 720 
ESNSLQSQFD KVSCSE5QLQ SQCEQMKQTN 2NLESRLLKE EELRKEEVQT LQAELACRQT 780 
EVKALSTQVE ELKDELVTQR RKHASS 1KDL TKQLQQ ARRK LDQVESGS YD KEVSSMGSRS 840 
SSSGSLNARS SAEDRSPENT GSS VAVDNFP QVDKAMLIER IVRLQKAHAR KNEHEFMED 900 
HPCQLVEEIR KKTKPQSY1 LREESGTLSS EASD FNKVHL SRRGG1MASL YTSHPADNGL 960 
TLELS LEINR KLQAVLEDTL LKNTTLKENL QTLGTEIERL KHQHELEQR TKKT 



SEQ ID NO:252 PBJ6 DNA sequence:
Nucleic Acid Accession #: D33760

Coding sequence: 56-1459 (underlined sequence corresponds to start and stop codon)

1 11 21 31 41 51

| | | | | |

TTGCCGTGAA GGGCTGTGCG GTTCCCGTGC GCGCCGGAGC CTGCTGTGGC CTCTTATGCA 60

CTCCACCACC CCCATCAGCT CCCTCTTCTC CTTCACCAGC CCCGCAGTGA AGAGACTSCT 120

AGGCTGGAAG CAAGGAGATG AAGAGGAAAA GTGGGCAGAG AAGGCAGTGG ACTCTCTAGT 180

GAAGAAGTTA AAGAAGAAGA AGGGAGCCAT GGACGAGCTG GAGAGGGCTC TCAGCTGCCC 240

GGGGCAGCCC AGCAAATGCG TCACGATTCC CCGCTCCCTG GACGGGCGGC TGCAGGTGTC 300

CCACCGCAAG GGCCTGCCCC ATGTGATTTA CTGTCGCGTG TGGCGCTGGC CGGATCTGCA 360

GTCCCACCAC GAGCTGAAGC CGCTGGAGTG CTGTGAGTTC CCATTTGGCT CCAAGCAGAA 420

AGAAGTGTGC ATTAACCCTT ACCACTACCG CCGG0TGGA0 ACTCCAGTAC TGCCTCCTGT 480

GCTCGTGCCA AGACACAGTG AATATAACCC CCAGCTCAGC CTCCTGGCCA AGTTCCGCAG 540

CGCCTCCCTG CACAGTGAGC CACTCATGCC ACACAACGCC ACCTATCCTG ACTCTTTCCA 600

GCAGCCTCCG TGCTCTGCAC TCCCTCCCTC ACCCAGCCAC GCGTTCTCCC AGTCCCCGTG 660

CACGGCCAGC TACCCTCACT CCCCAGGAAG TCCTTCTGAG CCAGAGAGTC CCTATCAACA 720

CTCAGTTGAC ACACCACCCC TGCCTTATCA TGCCACAGAA GCCTCTGAGA CCCAGAGTGG 780
CCAACCTOTA GATGCCACAG CTGATAGACA TGTAGTGCTA TCGATACCAA ATGGAGACTT 840

TCGACCAGTT TGTTACGAGG AGCCCCACCA CTCGTGCTCG GTCGCCTACT ATGAACTGAA 900 

CAACCGAGTT GGGGAGACAT TCCAGGCTTC CTCCCGAAGT GTGCTCATAG AT666TTCAC 960 

CGACCCTTCA AATAACAGGA ACAGATTCTG TCTTGGACTT CTTTCTAATG TAAACAGAAA 1020 

CTCAACGATA GAAAATACCA GGAGACATAT AGGAAAGGGT GTGCACTTGT ACTACGTCGG 1080

GGGAGA6GTG TATGCCGAGT GCGT6AGTGA CAGCAGCATC TTTGTGCAGA GCCGGAACTG 1140 

CAACTATCAA CACGGCTTCC ACCCAGCTAC CGTCTGCAAG ATCCCCAGCG GCTGCAGCCT 1200 

CAAGGTCTTC AACAACCAGC TCTTCGCTCA GCTCCTGGCC CAGTCAGTTC ACCACGGCTT 1260 

TGAAGTCGTG TATGAACTGA CCAAGATGTG TACTATCCGG ATGAGTTTTG TTAAGGGTTG 1320 

GGGTGCTGAG TATCATCGCC AGGATGTCAC CAGCACCCCC TGCTGGATTG AGATTCATCT 1380 

TCATGGGCCA CTGCAGTGGC TGGACAAAGT TCTGACTCAG ATGGGCTCTC CACATAACCC 1440 
CATTTCTTCA GTGTCTTAAC AGTCATGTCT TAAGCTGCAT TTCCATAGGA T



Protein Accession #: NMIO^a^ PBJ6 Protein sequence:

MHSTTPBSL FSFTSPAVKR LLGWKQGDEE EKWAEKAVDS LVKKLKKKKG AMDELERALS 60 
CPGQPSKCVT IPRSLDGRLQ VSHRKGLPHV IYCRVWRWPD LQSHHELKPt ECCEFPFGSK 120 
QKEVONPYH YRRVETPVLP PVLVPRHSEY NPQLSLLAKF RSASLHSEPL MPHNATYPDS 180 
FQQPPCSALP PSPSHAFSQS PCTASYPHSP GSPSEPESPY QHSVDTPPLP YHATEASBTQ 240 
SGQPVDATAD RHWLS1PNG DFRPVCYEEP QHWCSVAYYE LNNRVGETFQ ASSRSVUDG 300 
FTDPSNNRNR FCLGLLSNVN RNSTENTRR HIGKGVHLYY VGGEVYAECV SDSSDFVQSR 360 
NCNYQHGFHP ATVCKIPSGC SLKVFNNQLF AQLLAQSVHH GFEWYELTK MCTTRMSFVK 420 
GWGAEYHRQD VTSTPCWIEI HLHGPLQWLD KVLTQMGSPH NPISSVS 



SEQ ID NO:254 PBJ8 DNA sequence:
Nucleic Acid Accession #: AB04684

Coding sequence: 472-4377 (underlined sequence corresponds to start and stop codon)

l 11 21 31 41 51 

1 I I I i I 

TGCAGGTTTG CAGGGTCTGA GATTACTTGG GCTTTTCCTG CCTTTTTCTT TTGCTTAAGG 60 

GATG6ACAAG GAGCTGAGAT TTATGACCCT TATTAGAGAA AAAAATGTGC CTTGCTAGGG 120 

TGGGGACACT TGGTTGATGC AGTCTCTCTC TCTCTTTCTC GGTGTTTATA ACAAAACAAA 180 

ACCAAAATGA ACTGAGGGGT TTGTAATGGT AGTTTGTTTG TTGCTGGAGA ATGCTACTTT 240 

GCATGCTTTT TTTCTCTTGC AGGGTATGTT CTGTCTTGTG CTTTTTCTTT TAGAAGCTAC 300 

TAAAGGGTGT TGGGGATGCT TCTGACTATT ATGAAGGCCA AAAGGCCTGT TGACTGGGGC 360 

TGCTTTTAAC CCTTTCCTAT TTGCTGAGAA TGCAGCCGTG TGACAGTAAC TGAACATTGG 420 

TCTAAAGTCT TTCCAAAAGG TCAAGGTTCA CAAGAACATC TGCTCAAATT AATGACCATG 480 

GGGGATATGA AGACCCCAGA CTTTGATGAC CTCCTGGCAG CATTTGACAT CCCAGATATG 540 

GTCGATCCTA AAGCAGCTAT TGAGTCTGGA CACGATGACC ATGAAAGCCA CATCAAGCAG 600 

AATGCTCACG GAGAGGATGA CTCCCACGCA CCATCATCTT CTGATGTGGG TGTCAGCGTT 660 

ATCGTCAAGA ATGTTCGGAA CATT6ACTCT TCCGAGGGCO GGGA6AAAGA CGGCCACAAC 720 

OCCACTGGCA ATGGCTTACA TAATGGGTTT CTCACAGCAT CCTCCCTTGA CAGTTACAGT 780 

AAAGATGGAG CAAAGTCCTT GAAAGGAGAT GTGCCTGCCT CTGAGGTGAC ACTGAAAGAC 840 

TCGACATTCA GCCAGTTTAG CCCGATCTCC AGTGCTGAAG AGTTTGATGA CGACGAGAAG 900 

ATTGAGGTGG ATGACCCCCC TGACAAGGAG GACATGCGAT CAAGCTTCAG GTCGAATGTG 960 

TTGACGGGGT CGGCTCCCCA GCAGGACTAC GATAAGCTGA AGGCACTCGG AGGGGAAAAC 1020 

TCCAGCAAAA CTGGACTCTC TACGTCAGGC AATGTGGAGA AAAACAAAGC TGTTAAGAGA 1080 

GAAACAGAAG CCAGTTCTAT AAACCTGAGT GTTTATGAAC CTTTTAAAGT CAGAAAAGCA 1140 

GAGGATAAAT TGAAGGAAAG CTCTGACAAG GTGCTGGAAA ACAGAGTCCT AGATGGGAAG 1200 

CTGAGCTCCG AGAAGAATGA CACCAGOCTC CCCAGCGTTG CGCCATCAAA GACAAAGTCG 1260 

TCCTCCAAGC TCTCGTCCTG CATCGCTGCC ATCGCGGCTC TCAGCGCTAA AAAGGCGGCT 1320 

TCAGACTCCT GCAAAGAACC AGTGGCCAAT TCGAGGGAAT CCTCCCCGTT ACCAAAAGAA 1380 

GTAAATGACA GTCCGAGAGC CGCTGACAAG TCTCCTGAAT CCCAGAATCT CATCGA0GG6 1440 

ACCAAAAAAC CATCCCTGAA GCAACCGGAT AGTCCCAGAA GCATCTCAAG TGAOAACAGC 1500 

A6CAAAGGAT CCOCGTCCTC TCCCGCAGGG TCCACACCAG CAATCOCCAA AGTCGGCATA 1560 

AAAACCATTA AGACATCTTC TGGGGAAATC AAGAGAACAG TGACCAGGGT ATTGCCAGAA 1620 

GTGGATCTTG ACTCTGGAAA GAAACCTTCC GAGCAGACAG CGTCCGTGAT GGCCTCTGTG 1680 

ACATCCCTTC TGTCGTCTCC AGCATCAGCC GCCGTCCTTT CCTCTCCCCC CAGGGCGCCT 1740 

CTCCAGTCTG CGGTCGTGAC CAATGCAGTT TCCCCTGCAG AGCTCACCCC CAAACAGGTC 1800 

ACAATCAAGC CTGTGGCTAC TGCTTTCCTC CCAGTGTCTG CTGTGAAGAC GGCAGGATCC 1860 

CAAGTCATTA ATTTGAAGCT CGCTAACAAC ACCACGGTGA AAGCCACGGT CATATCTGCT 1920 

GCCTCTGTCC AGAGTGCCAG CAGCGCCATC ATTAAAGCTG CCAACGCCAT CCAGCAGCAA 1980 

ACTCTCGTGG TGCCGGCATC CAGCCTGGCC AATGCCAAAC TCGTGCCAAA GACTGTGCAC 2040 

CTTGCCAACC TTAACCTTTT GCCTCAGGGT GCCCAGGCCA CCTCTGAACT CCGCCAAGTG 2100 

CTAACCAAAC CTCAGCAACA AATAAAGCAG GCAATAATCA ATGCAGCAGC CTCGCAACCC 2160 

CCCAAAAAGG TGTCTCGAGT CCAGGTGGTG TCGTCCTTGC AGAGTTCTGT GGTGGAAGCT 2220 

TTCAACAAGG TGCTGAGCAG TGTCAATCCA GTCCCTGTTT ACATCCCAAA CCTCAGTCCT 2280 

CCCGCCAATG CAGGGATCAC GTTACCGACG CGTGGGTACA AGTGCTTGGA GTGTGGGGAC 2340 

TCCTTTGCAC TTGAAAAGAG TCTGACCCAG CACTACGACA GACGGAGCGT GCGCATCGAA 2400 

GTAACGTGCA ACCATTGTAC AAAGAACCTC GTTTTTTACA ACAAATGCAG CCTCCTTTCC 2460 

CATGCCCGTG GGCATAAGGA GAAAGGGGTG GTAATGCAAT GCTCOCACTT AATTTTAAAG 2520 

CCAGTCCCAG CAGATCAAAT GATAGTTTCT CCGTCAAGCA ATACTTCCAC TTCAACTTCC 2580 

ACTCTTCAGA GCCCTGTGGG AGCTGGCACA CACACTGTCA CAAAAATTCA GTCTGGCATA 2640 

ACTGGGACAG TCATATCGGC TCCTTCAAGC ACTCCCATCA CCCCAGCCAT GCCCCTAGAT 2700 

GAAGACCCCT CCAAACTGTG TAGACATAGT CTAAAATGTT TGGAGTGTAA TGAAGTCTTC 2760 

CAGGACGAGA CATCACTGGC TACACATTTC CAGCAGGCTG CAGATACGAG TGGACAAAAG 2820 
ACTTGCACTA TCTGCCAGAT GCTGCTTCCT AACCAGTGCA GTTATGCATC ACACCAGAGA 2880

ATCCATCAGC ACAAATCTCC CTACACCTGC CCTGAGTQTO GGGCCATCTG CAGGTCGGTO 2940 

CACTTCCAGA CCCACGTCAC CAAGAACTGT CTGCACTACA CGAGGAGAGT TQGTTTTC6A 3000 

TGTGTGCATT GCAATGTTGT GTACTCTGAT GTGGCTGCTC TQAAOTCTCA CATTCAAGGT 3060 

TCTCACTGTG AAGTCTTCTA CAAGTGTCCT ATTTGTCCAA TGGCGTTTAA GTCTGCCCCA 3120 

AGCACACATT CCCACGCCTA CACACAGCAT CCTGGCATCA AGATAGGA6A ACCAAAAATA 3180 

ATATATAAGT GTTCCATGTG CGACACTGTG TTCACCCTGC AAACCTTGCT GTATCGCCAC 3240 

TTTGACCAAC ACATTGAAAA CCAGAAGGTG TCTGTTTTCA AGTGTCCAGA CTGTTCTCTT 3300 

TTATATGCAC AGAAGCAACT TATGATGGAC CATATCAAGT CTATGCATGG AACATTGAAA 3360 

AGTATTGAAG GGCCTCCAAA CTTGGGTATA AACTTGCCTT TGAGCATTAA GCCTGCAACT 3420 

CAAAATTCAG CAAATCAGAA CAAAGAGGAC ACCAAATCCA TGAATGGGAA AGAGAAATTG 3480 

GAAAAGAAAT CTCCATCTCC TGTGAAAAAA TCAATGGAAA CCAAGAAAGT GGCCAGTCCT 3540 

GGGTGGACGT GTTGGGAGTG TGACTGCCTG TTCATGCAGA GAGATGTGTA CATATCCCAC 3600 

GTGAGQAAGG AGCACGGGAA GCAAATGAAG AAACACCCCT GCCGCCAGTG TGACAAGTCT 3660 

TTCAGCTCGT CCCACAGCCT GTGCCGGCAC AACCGGATCA AGCACAAAGG CATCAGGAAA 3720 

GTGTACGCCT GCTCGCACTG CCCAGACTCC AGACGTACCT TTACCAAACG TTTGATGCTG 3780 

GAGAAGCACG TCCAGCTGAT GCATGGCATC AAGGACCCTG ACCTGAAAGA AATGACAGAT 3840 

GCCACCAATG AGGAGGAAAC AGAAATAAAA GAAGACACTA AGGTCCCCAG TCCCAAGCGG 3900 

AAGTTGGAAG AACCAGTTCT GGAGTTCAGG CCTCCCCGAG GAGCAATCAC TCAACCACTG 3960 

AAAAAGCTGA AAATCAATGT TTTTAAGGTT CACAAGTGTG CCGTGTGTGG CTTCACCACC 4020 

GAAAACCTGC TGCAATTCCA CGAACACATC CCTCAGCACA AATCGGATGG TTCTTCCTAC 4080 

CAGTGCCGGG AGTGTGGCCT CTGCTACACG TCTCACGTCT CTCTGTCCAG GCACCTCTTC 4140 

ATCGTACACA AGTTAAAGGA ACCTCAGCCA GTGTCCAAGC AAAATGGGGC TGGGGAAGAT 4200 

AACCAACAGG AGAACAAACC CAGCCACGAG GATGAATCCC CTGATGGCGC CGTGTCAGAC 4260 

AGAAAGTGCA AAGTGTGCGC AAAAACTTTT GAAACTGAAG CTGCCTTAAA TACTCACATG 4320 

CGGACACACG GCATGGCCTT CATCAAATCC AAAAGGATGA GCTCAGCCGA GAAATAGCCA 4380 

CAGATGCTCC ATGAGGAAAA TCCCTGTCCA CATTGGAATA AAAAAGACAT TTTTGTTACA 4440 

AAGTTTGCAG TATAATAGAG TTAACAGTAC TGTCTAGGCT GTTGCAATAT ATTCTCTTTC 4500 

AATGTACCTT CCTTCACCTC GTCGTATATA TCCTCGATAA GTA3TAAAAC AGTATTTGAG 4560 

1TTAAAAGAG TTTGTATATA TTTAAATGAA TAACTTTTTA TACTCTTTGT TACATGTTTG 4620 

TATCAGTATT TAGTGGAAAA CCATTTGAGT TGTTTTGGGT TAGAATTTTT CTTTTTGTAC 4680 

TGTTTCTTTA AAACAGAGTT CTTAGTAACA GGGGCAGTTC CTGAATTCAA ATAAACCATT 4740

TTGTATGTTT GGATTTTGAA TGGGTTAACT AATTACAGGC TAAAATAATG CCTTTTTTAG 4800 

TGTTTTTAAT TTTTAGAATT CACTACATAA ATTGTAAGTA ATTGTGGGTC TCAAAAACAC 4860 

TAGGAACTTT TAAGTGTCTT AGCACTTCCT CGATGTGCCT GOCCTGAGGG AGTGAGTTCA 4920 

CATTTGAGAC AACTGCACTC CAGTGTGGAC GTGCCTTTGT CTTCAGGCCA TGCCGAAGGG 4980 

TGTTTAAAGC AGTCTTGCAG GTCGCTCCTT TCCCAGCCGT GGATAAAAAC TGAAGCTAGG 5040 

AATCTAATAA GGAATGCTGA TTTCCTCAGT TCCATTTTGA GGAATGGGGA AGGCTAITCT 5100 

AAAGAAAAAA ATGGGATTTG TTTTCTCGGC AGATCTGCAA GGCTGGCTTT AAGAGCACAA 5160 

GGAGGGAAAG TAACGAAAGG GCTGGACTAC TATAAAAGTT ACAAATACGT AGTTAGACCA 5220 

ATAGATTTAT ATAGTCAGGT TTTTGTCATG TAATTTATTA ACTAACTATT ACAGAAACAC 5280 

AGCTAAGAAT ATCAAGTATT TCTCTGGCTC TTGACAGAAA AAAATCAGTT GACTTAACCC 5340 

TTTGCTGTCA AAAGAGTTGG CGTTTCCTGT TCTGGGTGCT ACTGCCAAAC GTTATGGTAC 5400 

TTAGAGTCGG GATGCACAAC TTCAACCACC GACTTATCAA TGCAGCCGCC TGTGTATTGC 5460 

AATTGGCCGT TACCTTAAGC ACTGAGCCAC CCGGGTTTAG TTCAGCCATT TCAAGAAGTA 5520 

TATTTAACGT CGGTAGTTCT GCTTTATTAA AATGCAGCAG AGGTACTCTT CTGTCCCTTC 5580 

CGTTTATAGT TCTCTGAGAG AGTTCTATTT TTTGGTTTTG TTTTGTGTTT TCTTTTGCAT 5640 

TTTGTATCTT GTATTTATCC CTGAACATGT TTTGTACCTT TCTTTTTTTT TTTTTTTTAA 5700 

GAAAAGGAAT TCTTTTGTGT ATATATAGAT ACTTGCATGA TATACTGTAG TCAATGTTCG 5760 

GTTCCTCAAA AGGTCTTGCT GCTGTCAGGT GTTATGCACT CCATCCATCA TAACTGTATG 5820 
AAACACATTT CATATGTAAA TAAACGTGGG ACATTTG 



SEQ ID NO:255 PBJ8 Protein sequence:
Protein Accession #: BAB13455

MKTPDFDDLL AAFDIPDMVD PKAAJESGHD DHESHMKQNA HGEDDSHAPS SSDVGVSVIV 60 
KNVRNIDSSE GGEKDGHNFT GNGLHNGFLT ASSLDSYSKD GAKSLKGDVP ASEVTLKDST 120 
FSQFSPISS A EEFDDDEKIE VDDPPDKEDM RSSFRSNVLT GSAPQQDYDK LKALGGENSS 180 
KTGLSTSGNV EKNKA VKRET EASSINLSVY EPFKVRKAED KLXESSDKVL ENRVLDGKLS 240 
SEKNDTSLPS VAPSKTKSSS KLSSCIAAIA ALSAKKAASD SCKEPVANSR ESSPLPKEVN 300 
DSPRAADKSP ESQNLIDGTK KPSLKQPDSP RSISSENSSK GSPSSPAGST PAIPKVRIKT 360
KTSSGEIKR TVTRVLPEVD LDSGKKPSEQ TASVMASVTS LLSSPASAAV LSSPPRAPLQ 420 
S AWTNAVSP AELTPKQVTI KPVATAFLPV SAVKTAGSQV INLKLANNTT VKATVIS AAS 480 
VQSASSAUK AANAIQQQTV WPASSLANA KLVPKTVHLA NLNLLPQGAQ ATSELRQVLT 540 
KPQQQIKQAI INAAASQPPK KVSRVQWSS LQSSWBAFN KVLSSVNPVP VYIPNLSPPA 600 
NAGITLPTRG YKCLECGDSF ALEKSLTQHY DRRSVRIEVT CNHCTKNLVF YNKCSLLSHA 660 
RGHKEKGWM QCSHULKPV PADQMIVSPS SNTSTSTSTL QSPVGAGTHT VTKIQSGrTG 720 
TVISAPSSTP riPAMPLDED PSKLCRHSLK CLECNEVFQD ETSLATHFQQ AADTSGQKTC 780 
TICQMLLPNQ CSYASHQRIH QHKSPYTCPE CGAICRS VHF QTHVTKNCLH YTRRVGFRCV 840 
HCNWYSDVA ALKSHIQGSH CEVFVKCPIC PMAFKSAPST HSHAYTQHPG KIGEPKDY 900 
KCSMCDTVFT LGTLLYRHFD QHffiNQK VSV FKCPDCSLLY AQKQLMMDHI KSMHGTLK5I 960 
EGPPNLGINL PLSIKPATQN SANQNKEDTK SMNGKEKLEK KSPSPVKKSM ETKKVASPGW 1020 
TCWECDCLFM QRDVYISHVR KEHGKQMKKH PCRQCDKSFS SSHSLCRHNR IKHKGERKVY 1080 
ACSHCPDSRR TFTKRLMLEK HVQLMHGIKD PDLKEMTDAT NEEETEIKED TKVPSPKRKL 1140 
EEPVLEFRPP RGAJTQPLKK LKINVFKVHK CAVCX3FTTEN LLQFHEHIPQ HKSDGSSYQC 1200 
REOGLCYTSH VSLSRHLFTV HKLKEPQPVS KQNGAGEDNQ QENKPSHEDE SPDGAVSDRK 1260 
CKVCAKTFET EAALNTHMRT HGMAF1KSKR MSSAEK 
Nucleic Acid Accession #: AF111847

Coding sequence: 58-1608 (underlined sequence corresponds to start and stop codon)

l 11 21 31 41 51 

I I I I I ! 

TTTTCGTCGA CTCTTACCGG TTGGCTGGCC CAGCTGCGCC GCGGCTCACA GCTGACGATG 60

GGGGACCCCA GCAAGCAGGA CATCTTGACC ATCTTCAAGC GCCTCCGCTC GGTGCCCACT 120 

AACAAGGTGT GTTTTGATTG TGGTGCCAAA AATCCCA6CT GGGCAAGCAT AACCTATGGA 180 

GTGTTCCTTT GCATTGATTG CTCAGGGTCC CACOGGTCAC TTGGTGTTCA CTTGAGTTTT 240 

ATTCGATCTA CAGAGTTGGA TTCCAACTGG TCATGGTTTC AGTTGCGATG CATGCAAGTC 300 

GGAGGAAACG CTAGTGCATC TTCCTTTTTT CATCAACATG GGTGTTCCAC CAATGACACC 360 

AATGCCAAGT ACAACAGTCG TGCTGCTCAG CTCTATAGGG AGAAAATCAA ATCGCTCGCC 420 

TCTCAAGCAA CACGGAAGCA TGGCACTGAT CTGTGGCTTG ATAGTTGTGT GGTTCCACCT 480 

TTGTCCCCTC CACCAAAGGA GGAAGATTTT TTTGCCTCTC ACGTITCTCC TGAGGTGAGT 540 

GACACAGCGT GGGCATCAGC AATAGCAGAA CCATCTTCTT TAACATCAAG GCCTGTGGAA 600 

ACCACTTTGG AAAATAATGA AGGTGGACAA GAGCAAGGAC CAAGTGTGGA AGGTCTTAAT 660 

GTACCAACAA AGGCTACTTT AGAGGTATCC TCTATCATAA AAAAGAAACC AAATCAAGCT 720 

AAAAAAGGCC TTGGGGCCAA AAAAGGAAGT TTGGGAGCTC AGAAACTGGC AAACACATGC 780 

TTTAATGAAA TTGAAAAACA AGCTCAAGCT GCGGATAAAA TGAAGGAGCA GGAAGACCTG 840 

GCCAAGGTGG TATCTAAAGA AGAATCAATT GTTTCATCAT TACGATTAGC CTATAAGGAT 900 

CTTGAAATTC AAATGAAGAA AGACGAAAAG ATGAACATTA GTGGCAAAAA AAATGTTGAC 960 

TCAGACAGAC TCGGCATGGG ATTTGGAAAT TGCAGAAGTG TTATTTCACA TTCAGTGACT 1020 

TCAGATATGC AGACCATAGA GCAGGAATCA CCCATTATGG CAAAACCAAG AAAAAAGTAT 1080 

AATGATGACA GTGACGATTC ATATTTTACT TCCAGCTCAA GTTACTTTGA CGAGCCAGTG 1140 

GAGTTAAGGA GCAGTTCTTT CTCTAGCTGG GATGACAGTT CAGATTCCTA TTGGAAAAAA 1200 

GAGACCAGCA AAGATACTGA AACAGTTCTG AAAACGACAG GCTATTCAGA CAGACCTACT 1260 

GCTCGCCGCA AGCCAGATTA TGAGCCAGTT GAAAATACAG ATGAGGCCCA GAAGAAGTTT 1320 

GGCAATGTCA AGGCCATTTC ATCAGATATG TATTTTGGAA GACAATCCCA GGCTGATTAT 1380 

GAGAOCAGGG CCCGCCTAGA GAGGCTGTCG GCAAGTTCCT CCATAAGCTC GGCTGATCTG 1440 

TTCGAGGAGC CGAGGAAGCA GCCAGCAGQG AACTACAGCC TGTCCAGTGT GCTGCCCAAC 1500 

GCCCCCGACA TGGCGCAGTT CAAGCAGGGA GTGAGATCGG TTGCTGGAAA ACTCTCCGTC 1560 

TTTGCTAATG GAGTCGTGAC TTCAATTCAG GATCGCTACG GTTCTTAATA CTGAAGTCAT 1620 

GATGTGTATT TCCTGGAGAA ATTCCTCTTT AAATGAAGAA GTAACCACAT CTCAGGCGGC 1680 

AGTGAAGTCC AGATAGTTTT GCAGATTGTT TTGCTACTTT TTCATATGGT ATATGTTTCT 1740 

GATTTTTAAT ATTTCTTTTG AGAAATTCTG AGTTCTGATG TAGGAGCTTT CCTGTGATTT 1800 

CTGTTTCACG TTCCTTCCTG TCACACCCTC CTTTGGCGTC TCTGTGTATA TCCTTGCTTT 1860 

ATTTTCTTGG AACCTTTGAT TTCAACACTG AGGGCCTGGA GACCTCGGCT CCTCCTGCTC 1920 

CTGAACCAGG AGGCTTCATG TGGGGGAGGA GGAGAGGTCT CCATGTGACA CATGGGCTCA 1980 

GGGCTGCCAG AATCAGCGGA TGCTGGATGG GCCTGCAGAA ACAACACTCA CCACACACAC 2040 

TTCCTTCAAA AGACCAAAAG TGACTGGTGT CTCGTGTGAC AGATTGCTTC ATTTATGTTT 2100 

CTACATAGTA AGGTGACTGC CAAATAATAT TTGAAGTCAT CTGTCTCTTT GTAAATTATT 2160 

TTATATGACC TATAAATTTA AAAATGTTTT TCAGTGAGTG CTTTTAACAA ACTTAAGCTT 2220 

CTGCCC7GCC AAGGGAATXA ATGTTATCTT GTGAAAGGTG TTGCTGTTTG AATTGATGAC 2280 

AAATGGAAGA TGAGAACTCC CTAAGAGTTC TCATAATAAA TCATCTCATC ACAAATCAAT 2340 

ACGGTATACA GAGTTAAAGT GGAATGAGGT AAGAAGATAC AGCTACAGAA AATAGTTGCG 2400 

TGTATGGGAG AACAGTCATT GTAATTGGGT AGTTTTGTTA ATAAATATTT TTAAATCTTG 2460 

CTTTTCAGAA ATTACCGAAT GTGTATAAAC AAATAAAGAA AAATAATTTA GCTGTGTTTT 2520 

AGACAGCATT AGAATATATT GTTCAGCACA GTAAAATATA TTTGAAATTT GATAAGCCAA 2580 

AAATGTGGTT TTGAATGAAT ATTTTGTGAA TCTTTCTTAA AAGCTCAAAT TTGTAGACTT 2640 

CTAAATAGAA TAAACACTTG CAGCAGAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 2700 

AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 2760 
AAAAAAAAAA AAAAAAAAAA AAAAAAAA 



SEQ ID NO:257 PBM1 Protein sequence:
Protein Accession #: CAB76901

MGDPSKQDIL TIFKRLRSVP TNKVCFDCGA KNPSWASITY GVFLCIDCSG SHRSLGVHLS 60
FIRSTELDSN WSWFQLRCMQ VGGNASASSF FHQHGCSTND TNAKYNSRAA QLYREKIKSL 120
ASQATRKHGT DLWLDSCVVP PLSPPPKEED FFASHVSPEV SDTAWASAIA EPSSLTSRPV 180
ETTLENNEGG QEQGPSVEGL NVPTKATLEV SSIIKKKPNQ AKKGLGAKKG SLGAQKLANT 240
CFNEIEKQAQ AADKMKEQED LAKVVSKEES IVSSLRLAYK DLEIQMKKDE KMNISGKKNV 300
DSDRLGMGFG NCRSVISHSV TSDMQTIEQE SPIMAKPRKK YNDDSDDSYF TSSSSYFDEP 360
VELRSSSFSS WDDSSDSYWK KETSKDTETV LKTTGYSDRP TARRKPDYEP VENTDEAQKK 420
FGNVKAISSD MYFGRQSQAD YETRARLERL SASSSISSAD LFEEPRKQPA GNYSLSSVLP 480
NAPDMAQFKQ GVRSVAGKLS VFANGVVTSI QDRYGS
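
Each DNA entry gives its coding range as 1-based, inclusive positions that include the start and stop codons, so a protein listing can be checked against its DNA listing by translating that range. The following is a minimal sketch of such a check, not part of the original listing. The helper names `clean` and `translate_cds` are illustrative assumptions, and the two example rows are the first two PBM1 rows with the start codon read as ATG. Codons containing an unreadable character are reported as `X`.

```python
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def clean(listing: str) -> str:
    """Keep only letters: drops spaces, line breaks and position numbers."""
    return "".join(ch for ch in listing.upper() if ch.isalpha())


def translate_cds(dna: str, start: int, end: int) -> str:
    """Translate dna[start..end] (1-based, inclusive), stopping at a stop codon.

    Any codon that is not a plain A/C/G/T triplet (for example one holding
    an OCR-damaged character) is translated as 'X'.
    """
    cds = dna[start - 1:end]
    protein = []
    for i in range(0, len(cds) - 2, 3):
        residue = CODON_TABLE.get(cds[i:i + 3], "X")
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# First two rows of the PBM1 DNA listing (hypothetical usage example).
rows = """
TTTTCGTCGA CTCTTACCGG TTGGCTGGCC CAGCTGCGCC GCGGCTCACA GCTGACGATG 60
GGGGACCCCA GCAAGCAGGA CATCTTGACC ATCTTCAAGC GCCTCCGCTC GGTGCCCACT 120
"""
print(translate_cds(clean(rows), 58, 120))
```

Translating positions 58-120 of these two rows should reproduce the first 21 residues of the PBM1 protein listing. A run of `X` residues or an early stop usually points at a damaged character in the DNA rows rather than at a real sequence difference.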



SEQ ID NO:258 PBM4 DNA sequence
Nucleic Acid Accession #: D30891

Coding sequence: 1-4032 (underlined sequence corresponds to start and stop codon) 

ATGGATACTG TCATGAAGCA GACACATGCT GACACACCTG TTGATCATTG TCTATCTGGC 60
ATAAGAAAGT GTAGCAGCAC CTTTAAGCTT AAAAGTGAAG TCAACAAGCA TGAAACAGCC 120 
CTTG AAATGC AGAATOCAAA TTTGAACAAT AAAGAATGTT GTTICAOCTT TACGTTGAAT 180 
GGAAACTCCA GAAAATTAGA CCGTAGTGTG TTTACAGCAT ATGGTAAACC CAGCGAGAGT 240 
ATCTACTCAG CCCTG AGTGC TAATGACTAT TTCAGTG AAA GG ATAAAGAA TCAGTTTAAT 300 
AAGAACATTA TTGTTTATGA AG AAAAGACA ATAGATGGAC ATATAAATTT AGGAATGCCT 360 
CTCAAGTGCC TGCCTAGTG A TTCTCATTTT AAAATTACAT TTGGTCAAAG AAAG AGTAGC 420 
AAAGAAGATG GACACATATT ACGCCAATGT GAAAATCCAA ACATGGAATG CATTCTTTTT 480
CATGTTGTTG CTATAGG AAG GACAAG AAAG AAGATTGTTA AGATCAACGA ACTTCATGAA 540 
AAAGGAAGTA AACTTTGTAT TTATGCCTTG AAGGGTGAGA CTATTGAAGG AGCCTTATGC 600 
AAGGATGGCC GTTTTCGGTC TGACATAGGT GAATTTGAAT GGAAACTAAA GGAAGGTCAT 660 
AAGAAAATTT ATGGAAAACA GTCCATGGTG GATGAAGTAT CTGGAAAAGT CTTAGAAATG 720 
GACATTTCAA AAAAAAAAGC ATTACAACAG AAAGATATCC ATAAAAAAAT TAAACAGAAT 780 
GAAAGTGCCA CTGATGAAAT TAATCACCAG AGTCTOATAC AGTCTAAGAA AAAAGTCCAC 840 
AAACCAAAGA AAGATGGAGA GACCAAAGAT GTAGAACACA GCAGAGAGCA AATTCTCCCA 900 
CCTCAGG ATC TAAGCCATTA TATTAAAGAT AAAACTCGCC AGACAATTCC CAGGATTAGA 960 
AATTATTACT TTTGTAGTTT GCCCCGAAAA TATAGGCAAA TAAACTCACA AGTTAGACGG 1020 
AGGCCGCATC TGGGTAGGCG GTATGCTATT AATCTGGATO TCCAAAAGGA GGCAATTAAT 1080 
CTCTTAAAGA ATTATCAAAC GTTGAATGAA GCCATAATGC ATCAGTATCC GAATTTTAAA 1 140 
GAGGAGGCAC AGTGGGTAAG AAAATATTTT CGGGAAGAAC AAAAGAGAAT GAATCTTTCA 1200 
CCAGCTAAGC AATTCAACAT ATATAAAAAG GACTTCGGAA AAATGACTGC AAATTCTGTT 1260 
TCAGTTGCAA CCTGCGAACA GCTTACATAT TATAGCAAGT CAGTTGGGTT CATGCAATGG 1320 
GACAATAATG GAAACACAGG TAATGCTACT TGCTTTGTCT TCAATGGTGG TTATATTTTC 1380 
ACCTGTCGAC ATQTTGTACA TCTTATGGTG GGTAAAAACA CACATCCAAG TTTGTGGCCA 1440 
GATATAATT A GCAAATGTGC GAAGGTAACC TTCACTTATA CAGAGTTCTG CCCTACTCCT 1500 
G ACAATTGGT TTTCCATTG A GCCATGGCTT AAAGTGTCCA ATG AAAATCT AG ATTATGCC 1560 
ATTTTAAAAC TAAAAGAAAA TGGAAATGCG TTTCCTCCAG GACTATGGCG ACAGATTTCT 1620 
CCTCAACCAT CTACTGGTTT GATTTATTTA ATTGGTCATC CTGAAGGCCA GATCAAGAAA 1680 
ATAG ATGGTT GTACTGTG AT TCCTCTAAAC GAACG ATTG A AAAAATATCC AAACGATTGT 1740 
CAAGATGGGT TGGTAGATCT CTATGATACC ACCAGTAATG TATACTGTAT GTTTACCCAA 1800 
AGAAGTTTCC TATCAGAGGT TTGGAACACA CACACGCTTA GTTATG ATAC TTGTTTCTCT 1860 
GATGGGTCCT CAGGCTCCCC AGTGTTTAAT GCATCTGGCA AATTGGTTGC TTTGCATACC 1920 
TTTGGGCTTT TTTATCAACG AGGATTTAAT GTGCATGCCC TTATTGAATT TGGTTATTCT 1980 
ATGGATTCTA TTCTTTGTGA TATTAAAAAG ACAAATGAGA GCTTGTATAA ATCATTAAAT 2040 
GATGAGAAAC TTGAGACCTA CGATGAAGAG AAAGCCCGGC CCAGGCCAGC CTACCGGCGA 2100 
CTAGGATGCT TTCGCTTTCG CTCTCGCTTT CCAATACTCG GGACTGGGGA AACCGGGAGA 2160 
ATAGAAGCAG GCAAGGACCG CCGTGGGCAC GGGGTCAGTG AGACAGGGTC CTGCTCGCGG 2220 
CGTCAAGGAG GAGCGCTGTG GGTGTCCCCA GCGCAGCCAA TCGGCTTCCG AAGTAGCTGG 2280 
AGCTCTGGAG CCTTTGCTTC CTCAAATACG AGCGGGAACT GCGTTGAGCG CTGGATTCCA 2340 
GGCCGAGTGC TGGCGAGGCG CGCAGTCTCT AAAGAGCAAC AGAATAATTG CAGTACTTCT 2400 
CTAATGAGGA TGGAGTCTAG AGGAGACCCA AGAGCCACAA CTAATACCCA GGCTCAAAGA 2460 
TTCCATTCAC CTAAGAAAAA TCCAGAAGAC CAGACCATGC CCCAAAATAG GACAATATAT 2520 
GTTAOCTTGA AGGCTGTCAG AAAAG AGATA G AAACTCACC AAGGOCAAGA AATGCTTGTG 2580 
CGTGGCACAG AAGGAATCAA AG AGTACATA AACCTTGGAA TCJCCCCTCAG TTGTTTCCCT 2640 
GAAGGTGGCC AGGTGGTCAT TACATTTTCC CAAAGTAAAA GTAAGCAGAA GGAAGATAAC 2700 
CACATATTTG GCAGGCAGGA CAAAGCATCG ACTGAATGTG TCAAATTTTA CATTCATGCA 2760 
ATTGGAATTG GGAAGTGTAA AAGAAGGATT GTTAAATGTG GG AAGCTTCA CAAAAAGGGG 2820 
CGCAAACTCT GTGTTTATGC TTTCAAAGGA GAAACCATCA AGGATGCACT GTGCAAGGAT 2880 
GGCAGATTTC TTTCCTTTCT GGAGAATGAT GATTGGAAAC TCATTGAAAA CAATGACACC 2940 
ATTTTAGAAA GCACCCAGCC AGTTGATGAA TTAGAAGGCA GATACTTTCA GGTTGAGGTT 3000 
GAGAAAAGAA TGGTCCCCAG TGCAGCAGCT TCTCAGAATC CTGAGTCAGA GAAAAGAAAC 3060 
ACCTGTGTGT TGAGAGAACA AATCGTGGCT CAGTACCOCA GTTTG AAAAG AGAAAGTGAA 3120 
AAAATCATTG AAAACTTCAA GAAAAAAATG AAAGTAAAAA ATGGGGAAAC ATTATTTGAA 3180 
TTGCATAGAA CAACX5TTTGG GAAAGTAACA AAAAATTCTT CTTCG ATTAA AGTAGTGAAA 3240 
CTTCTTGTAC GTCTCAGTGA CTCAGTTGGG TACTTATTCT GGGACAGTGC AACTACGGGT 3300 
TACGCCACCT GCTTTGTTTT TAAAGGATTG TTCATTTTAA CTTGTCGGCA TGTAATAG AT 3360 
AGCATTGTGG GAG ACGGAAT AGAGCCAAGT AAGTGGGCAA CCATAATTGG TCAATGTGTA 3420 
AGGGTGACAT TTGGTTATGA AGAGCTAAAA GACAAGGAAA CAAACTACTT TTTTGTTGAA 3480 
CCTTGGTTTG AGATACATAA TGAAGAGCTT GACTATGCTG TCCTGAAACT GAAGGAAAAT 3540 
GGACAACAAG TACCTATGGA ACTATATAAT GGAATTACTC CTGTGCCACT TAGTGGGTTG 3600 
ATACATATTA TTGGCCATCC ATATGGAGAA AAAAAGCAGA TTGATGCTTG TGCTGTGATC 3660 
CCTCAGGGTC AGOGAGCAAA GAAATGTCAG GAACGTCTTC AGTCTAAAAA AGCAGAAAGT 3720 
CCAGAGTATG TCCATATGTA T ACTCAAAGA AGTTTCCAG A AAATAGTTCA CAACOCTGAT 3780 
GTGATTACCT ATGACACTGA ATTTTTCTTT GGGGCTTCCG GCTCCCCTGT GTTTGATTCA 3840 
AAAGGTTCAT TGGTGGCCAT GCATGCTGCT GGCTTTGCTT ATACTTACCA AAATGAGACT 3900 
CGTAGTATCA TTGAGTTTGG CTCTACCATG GAATCCATCC TCCTTGATAT TAAGCAAAGA 3960 
CATAAACCAT GGTATG AAG A AGTATTTGTA AATCAGCAGG ATGTAGAAAT GATGAGTGAT 4020 
GAGGACTTGT GAGAATTCAG TCTACTGGAT TTAAGGGAAT GGCTTATGGA GTTGTTATTT 4080
CGTAGGCATT GAAAATGGTT TTCTAAACTC CAAAATGGTC ATCTTATCAA TAATAATAAT 4140 
ATTGACCATT TOCTATCTGC CAGGCATTTT TCTA AGCACA TGAAGAAATT AGTOCTAACA 4200 
ACACTATGAG ATGGACTATA ACTTGCCCAA ATTTTTTTTT TTTTTGAGAC TGAGTCTCAC 4260
TCTGTCGCCT GGGCTGGAGT ACAGTGGTGC GATCTCAGCT CACTGCAACT TCCACCTCCC 4320 
AGGTTCAAGC GATTCTTATG CCTCAGTCTC CTGAGCAGCT GGGATTACAG GCAAACGCCA 4380 
CCACACCCAG CTAAATTTTT TTTTTTTTTT TGTATTTTTA GTAGAGACAG GGTTTCACCA 4440
TGTTGGTCAG GCGGGTCTCG AACTCCTGAC CTCGTGATCC ACCTGCCTCG GCCTTCCAAA 4500 
GTGCTGGGAT TACAAGTTTO AGCCACTGCA CCTGGCTAAC TTGCCCTATT TTAAAGTCAA 4560 
GCAATGGGAA GAATAACAAG ATTATATAGT AATCAGTTTC ATGACACTAA AAGTCATATA 4620 
GTCATAGGGT TTTTTCATCT TTCATATCTT TGCCTAAATT CATTTGCTAC AGTGCAGGAA 4680 
CCAAAACTTG TTCATCTCAT GATTCCCTAC ATCTGACATA AGGAAAGTAA GTGCTCAGAA 4740 
AAATGTGCAG GTCAATAAGT TGCAAAAGTT GGGGCTGCAA TTAATGCTAA CATAAGAGCT 4800 
AAATGCTTGA TTAGAAATGA TCTCAAAACC TTTTAGAATT TCCAAAATCT TCATATTACT 4860 
GAAACTGTCG GAATATATGG GTOCTGAAAT TCAGAAGATG ATAGTCACTC TTCCCATATT 4920 
TATAGGCTAT TAAGGCAAGG GATATCTTAA ACATCATATT ACTTTATTTA GATTTCTACT 4980 
ACTCCAATTA TTAATGTTAT GTATTTCTCA TTGTTTTACT TCTTCATGGT ATTATGAAGA 5040 
CTATATAGAT GATTCAACCA AGCCTGCAAA TCTCCCTCTT GTGGAATTCC ACTGGACCCA 5100 
ATCTGTTTTC CATTTOCATT GCAATACTAC TAAAGCCATA CAATATCAAG CACOCTCCCT 5160 
CTAGGTCCAG GGACTATCAC AGAAGAAGCA GGCATGTAAG ATTTTAAGGA CTGGTTTCGA 5220
GGGGTCGAGT GTAGGAAAAC AGCCTGTTGC ATTGTA AG AG TGATGTCACC TTGAAGAGCA 5280 
GCTGGCATGA TGACTGCTGT TTGACTCCTG CATACCAAGA TATTCTGCAG CAATGTCnT 5340 
AAACAGTGCC GGTAGTACAG ATAACCCCTC ATAAAGATGC TTATCTAACC TCCCCAGTGT 5400 
TCAGGTGTTT CACAAG AAAG TCTGAGATAT GACTAGCTAC ACGTTTTGCC AAAAATGCTT 5460 
GTTATATAAA GGGTACTTTT GGGAGGGTGA GTGCCGGCAT TTAGTGGCTG CTAGAAACAT 5520 
TGCTTCTGTT TGTA AGTTCC TATTAAATGT TCTTTCTGAG AAAAAAAAAA A 



SEQ ID NO:259 PBM4 Protein sequence:
Protein Accession #: BAB67788

MDTVMKQTHA DTPVDHCLSG IRKCSSTFKL KSEVNKHETA LEMQNPNLNN KECCFTFILN 60 
GNSRKLDRSV FTAYGKPSES IYSALSANDY FSERDCNQFN KNIIVYEEKT IDGHINLGMP 120 
LKCLPSDSHF KTTFGQRKSS KEDGHILRQC ENPNMECILF HWAIGRTRK KIVKINELHE 180 
KGSKLCIYAL KGBHEGALC KDGRPR5DIG EFEWKLKEGH KKIYGKQSMV DEVSGKVLEM 240 
DKKKKALQQ KDIHKKIKQN E5ATDEINHQ SUQSKKKVH KPKKDGEIKD VEHSREQILP 300 
PQDLSHYIKD KTRQTTPRIR NYYPCSLPRK YRQINSQVRR RPHLGRRYAI NLDVQKEAJN 360 
LLKNYQTLKE AIMHQYFNFK EEAQWVRKYF REEQKRMNLS PAKQFNIYKK DFGKMTANSV 420 
S VATCEQLTY YSKSVGFMQW DNNGNTGNAT CFVFNGGYIF TCRHWHLMV GKNTHPSLWP 480 
DIISKCAKVT FTYTEPCPTP DNWFSffiPWL KVSNENLDYA ILKLKENGNA FPPGLWRQB 540 
PQPSTGLIYL IGHPEGQIKK IDGCTVIFLN ERLKKYPNDC QDGLVDLYDT TSNVYCMFTQ 600 
RSFLSEVWNT HTLS YDTCFS DGSSGSPVFN ASGKLVALHT FGLFYQRGFN VHALIEFGYS 660 
MDSHjCDIKK "WESLYKSLN DEKLETYDEE KARPRPAYRR LGCFRFRSRF PHjGTGETGR 720 
EAGKDRRGH GVSETGSCSR RQGGALWVSP AQPIGFRSSW SSGAFASSNT SGNCVERWIP 780 
GRVLARRAVS KEQQNNCSTS LMRMESRGDP RATTMTQAQR FHSPKKNPED QrrMPQNRTIY 840 
VTLKAVRKEI ETHQGQEMLV RGTEGIKEYI NLGMPLSCFP EGGQWITFS QSKSKQKEDN 900 
HIFGRQDKAS TECVKFYIHA IGIGKCKRRI VKCGKLHKKG RKLCVYAFKG ETTKDALCKD 960 
GRFLSFLEND DWKLIENNDT ILESTQPVDE LEGRYFQVEV EKRMVPSAAA SQNPESEKRN 1020 
TCVLREQIVA QYPSLKRESE KUENFKKKM KVKNGETLFE LHRTTPGKVT KNSSSKWK 1080 
LLVRLSDSVG YLFWDSATTG YATCFVFKGL FILTCRHVID SIVGDGIEPS KWATEGQCV 1140 
RVTPGYEELK DKETNYFFVE PWFEIHNEEL DYAVLKLKEN GQQVPMELYN GITPVPLSGL 1200 
IHnGHPYGE KKQIDACAVI PQGQRAKKCQ ERVQSKKAES PEYVHMYTQR SFQKIVHNFD 1260 
VTTYDTEFFF GASGSPVFDS KGSLVAMHAA GFAYTYQNET RSQEFGSTM ESILLDIKQR 1320 
HKPWYEEVFV NQQDVEMMSD EDL 



SEQ ID NO:260 PBQ1 DNA sequence
Nucleic Acid Accession #: NM_015642

Coding sequence: 489-2489 (underlined sequence corresponds to start and stop codon) 



1 11 21 31 41 51 

I I I I I I

ACATTTCAAA AAAAATACAT AGACTGATGT TTCAGACTTG TGCAGCATAA GCCTACAGGG 60 

TACGAAGAAT GAACTCTGAG AATGTTTGGA GAATGTTTCA TCATTACTAA CAGGATATTC 120 

CTCATGACAT TGCTGTCTGA TCTTTGACCA TCAGTCTGTG ACCTGCCCCT TCTCTTTACA 180 

TCCAGCCGCT CTCTGCTCCC TGCCCCAATG AACATCTGCA CTAGGCCCAA GCCTTGGAGT 240 

AATTTACCTG AAGAGTGACA CCATTGATTT TGAAACTACT GAAGAAACCC AAGACAGCTG 300 

AAAACCAGAA GGCATCTGAG GAGAATGAGA TTACTCAGCC GGGTGGATCC AGCGCCAAGC 360

CGGGCCTTCC CTGCCTGAAC TTTGAAGCTG TTTTGTCTCC AGACCCAGCC CTCATCCACT 420 

CAACACATTC ACTGACAAAC TCTCACGCTC ACACCGGGTC ATCTGATTGT GACATCAGTT 480 

GCAAGGGGAT GACCGAGCGC ATTCACAGCA TCAACCTTCA CAACTTCAGC AATTCCGTGC 540

TCGAGACCCT CAACGAGCAG CGCAACCGTG GCCACTTCTG TGACGTAACG GTGCGCATCC 600

ACGGGAGCAT GCTGCGCGCA CACCGCTGCG TGCTGGCAGC CGGCAGCCCC TTCTTCCAGG 660 

ACAAACTGCT GCTTGGCTAC AGCGACATCG AGATCCCGTC GGTGGTGTCA GTGCAGTCAG 720 

TGCAAAAGCT CATTGACTTC ATGTACAGCG GCGTGCTACG GGTCTCGCAG TCGGAAGCTC 780 

TGCAGATCCT CACGGCCGCC AGCATCCTGC AGATCAAAAC AGTCATCGAC GAGTGCACGC 840 

GCATCGTGTC ACAGAACGTG GGCGATGTGT TCCCGGGGAT CCAGGACTCG GGCCAGGACA 900 

CGCCGCGGGG CACTCCCGAG TCAGGCACGT CAGGCCAGAG CAGCGACACG GAGTCGGGCT 960 

ACCTGCAGAG CCACCCACAG CACAGCGTGG ACAGGATCTA CTCGGCACTC TACGCGTGCT 1020 

CCATGCAGAA TGGCAGCGGC GAGCGCTCTT TTTACAGCGG CGCAGTGGTC AGCCACCACG 1080 

AGACTGCGCT CGGCCTGCCC CGCGACCACC ACATGGAAGA CCCCAGCTGG ATCACACGCA 1140 

TCCATGAGCG CTCGCAGCAG ATGGAGCGCT ACCTGTCCAC CACCCCCGAG ACCACGCACT 1200 

GCCGCAAGCA GCCCCGGCCT GTGCGCATCC AGACCCTAGT GGGCAACATC CACATCAAGC 1260 

AGGAGATGGA GGACGATTAC GACTACTACG GGCAGCAAAG GGTGCAGATC CTGGAACGCA 1320 

ACGAATCCGA GGAGTGCACG GAAGACACAG ACCAGGCCGA GGGCACCGAG AGTGAGCCCA 1380 

AAGGTGAAAG CTTCGACTCG GGCGTCAGCT CCTCCATAGG CACCGAGCCT GACTCGGTGG 1440 

AGCAGCAGTT TGGGCCTGGG GCGGCGCGGG ACAGCCAGGC TGAACCCACC CAACCCGAGC 1500 

AGGCTGCAGA AGCCCCCGCT GAGGGTGGTC CGCAGACAAA CCAGCTAGAA ACAGGTGCTT 1560 

CCTCTCCGGA GAGAAGCAAT GAAGTGGAGA TGGACAGCAC TGTTATCACT GTCAGCAACA 1620 

GCTCCGACAA GAGCGTCCTA CAACAGCCTT CGGTCAACAC GTCCATCGGG CAGCCATTGC 1680 

CAAGTACCCA GCTCTACTTA CGCCAGACAG AAACCCTCAC CAGCAACCTG AGGATGCCTC 1740 

TGACCTTGAC CAGCAACACG CAGGTCATTG GCACAGCTGG CAACACCTAC CTGCCAGCCC 1800 

TCTTCACTAC CCAGCCCGCG GGCAGTGGCC CCAAGCCTTT CCTCTTCAGC CTGCCACAGC 1860 

CCCTGGCAGG CCAGCAGACC CAGTTTGTGA CAGTGTCCCA GCCCGGTCTG TCGACCTTTA 1920 

CTGCACAGCT GCCAGCGCCA CAGCCCCTGG CCTCATCCGC AGGCCACAGC ACAGCCAGTG 1980 

GGCAAGGCGA AAAAAAGCCT TATGAGTGCA CTCTCTGCAA CAAGACTTTC ACCGCCAAAC 2040 

AGAACTACGT CAAGCACATG TTCGTACACA CAGGTGAGAA GCCCCACCAA TGCAGCATCT 2100 

GTTGCCGCTC CTTCTCCTTA AAGGATTACC TTATCAAGCA CATGGTGACA CACACAGGAG 2160 
TGAGGGCATA CCAGTGTAGT ATCTGCAACA AGCGCTTCAC CCAGAAGAGC TCCCTCAACG 2220

TGCACATGCG CCTCCACCGG GGAGAGAAGT CCTACGAGTG CTACATCTGC AAAAAGAAGT 2280 

TCTCTCACAA GACCCTCCTG GAGCGACACG TGGCCCTGCA CAGTGCCAGC AATGGGACCC 2340 

CCCCTGCAGG CACACCCCCA GGTGCCCGCG CTGGCCCCCC AGGCGTGGTG GCCTGCACGG 2400 

AGGGGACCAC TTACGTCTGC TCCGTCTGCC CAGCAAAGTT TGACCAAATC GAGCAGTTCA 2460 

ACGACCACAT GAGGATGCAT GTGTCTGACG GATAAGTAGT ATCTTTCTCT CTTTCTTATG 2520 

AACAAAACAA AACAACAACA AAAAACAAAC AAACAAAAAA GCTATGGCAC TAGAATTTAA 2580

GAAATGTTTT GGTTTCATTT TTACTTTCTO T lTOgrTTT TGTTTCGTTT CATTTTGTAC 2640 

TACATGAAGA ACTGTTTTTT GCCTGCTGGT ACATTACATT TCCGGAGGCT TGGGTGAATA 2700 

ATAGTTTTCC CAGTCTCCCT CGGATGGTGG CCTTAAGGCC TGGTAGTGCT TCAAGAGGTC 2760 

CACTGGTTGG ATCTCTAGCT ACTGGCCTCT AAATACAACC CTTCTTTACA AAAAAAAAAA 2820 



SEQ ID NO:261 PBQ1 Protein sequence:
Protein Accession #: NP_056457

MTERIHSINL HNFSNSVLET LNEQRNRGHF CDVTVRIHGS MLRAHRCVLA AGSPFFQDKL 60
LLGYSDIEIP SVVSVQSVQK LIDFMYSGVL RVSQSEALQI LTAASILQIK TVIDECTRIV 120
SQNVGDVFPG IQDSGQDTPR GTPESGTSGQ SSDTESGYLQ SHPQHSVDRI YSALYACSMQ 180
NGSGERSFYS GAWSHHETA LGLPRDHHME DPSWITRME RSQQMERYLS TTPBTTHCRK 240 
QPRPVRIQTL VGNIHIKQEM EDDYDYYGQQ RVQUJERNES EBCTEDIDQA EGTESEPKGB 300 
SFDSGV5SSI GTEPDSVEQQ FGPGAARDSQ AEPTQPEQAA EAPAEGGPQT NQLETGASSP 360 
ERSNEVEMDS TVITVSNSSD KS VLQQPSVN TSIGQPLPST QLYLRQTETL TSNLRMPLTL 420 
TSNTQVIGTA GNTYLPALFT TQPAGSGPKP FLFSLPQPUV GQQTQFVTVS QPGLSTFTAQ 480 
LPAPQPLASS AGHSTASGQG EKKPYECTLC NKTFTAKQNY VKHMFVHTGB KPHQCSICWR 540 
SFSLKDYLIK HMVTHTGVRA YQCSICNKRF TQKSSLNVHM RLHRGEKSYE CY1CKKKFSH 600 
KTLLERHVAL HS ASNGTPPA GTPPGARAGP PG WACTEGT TYVCSVCPAK FDQIEQFNDH 660 
MRMHVSDG 



SEQ ID NO:262 PBQ6 DNA sequence
Nucleic Acid Accession #: AJ654187

Coding sequence: 1-912 (underlined sequence corresponds to start and stop codon)

1 11 21 31 41 51

I I I I I I

ATGGTGGAAG AGGAAACAGG CATATCTTAC ATGGTGGCAG ACAAGGGACA CCCTTCTACA 60 

AACTCTACCA CTTCTGCGCC GTCGTITCGA CCATATAAAA ACGACCTATG CGAACTGCGT 120 

CGGAAAACTC CCTCACGATG TAAAACGAAG ATCAGGAGCA GATTTGAAGA ATTACAAAGT 180 

GAATTGGTGC CAGTCAGCAT GTCAGAGACA GACCACATAG CCTCTACTTC CTCTGATAAA 240 

AATGTTGGGA AAACACCTGA ATTAAAGGAA GACTCATGCA ACTTGTTTTC TGGCAATGAA 300 

AGCAGCAAAT TAGAAAATGA GTCCAAACTA TTGTCATTAA ACACTGATAA AACTTTATGT 360 

CAACCTAATG AGCATAATAA TCGAATTGAA GCCCAGGAAA ATTATATTCC AGATCATGGT 420 

GGAGGTGAGG ATTCTTGTGC CAAAACAGAC ACAGGCTCAG AAAATTCTGA ACAAATAGCT 480 

AATTTTCCTA GTGGAAATTT TGCTAAACAT ATTTCAAAAA CAAATGAAAC AGAACAGAAA 540 

GTAACACAAA TATTGGTGGA ATTAAGGTCA TCTACATTTC CAGAATCAGC TAATGAAAAG 600 

ACTTATTCAG AAAGCCCCTA TGATACAGAC TGCACCAAGA AATTTATTTC AAAAATAAAG 660 

AGCGTTTCAG CATCAGAGGA TTTGTTGGAA GAAATAGAAT CTGAGCTCTT ATCTACGGAG 720

TTTGCAGAAC ATCGAGTACC AAATGGAATG AATAAGGGAG AACATGCATT AGTTCTGTTT 780

GAAAAGTGTG TGCAAGATAA ATATTTGCAG CAGGAACATA TCATAAAAAA GGCCAGACTT 840 

GGTCTCTOTT ATTTGCCATC AAGAACCTCA ATTGACACGT TAATTCCGTT TATCCCAAAT 900 
TTATATAGAT AA 



SEQ ID NO:263 PBQ6 Protein sequence:
Protein Accession #: NP_080170

MEPKEATGKE NMVTKKKNLA FLRSRLYMLE RRKTDTWES S VSGDHSGTL RRSQSDRTEY 60 
NQKLQEKMTP QGECSVAETL TPEEEHHMKR MMAKREKIIK EUQTEKDYL NDLELCVREV 120 
VQPLRNKKTO RLDVDSLFSN IESVHQISAK LLSLLEEATT DVEPAMQVIG EVFLQIKGPL 180 
EDIYKIYCYH HDEAHSILES YEKEEELKEH LSHCIQSLK 



SEQ ID NO:264 PBV7 DNA sequence
Nucleic Acid Accession #: NM_014323

Coding sequence: 662-2725 (underlined sequence corresponds to start and stop codon)

1 11 21 31 41 51

I I I I I I

GGGCCTACTC TGCCGCCGCC GCCGCCCGCC CGCTCCAGCC GCCGCCGCCG CCGCCACCGC 60 

CCTCCAGGCT CCGGGACCCG GCCCGCGCCA CCGCCCCCGT GCGCGCCCCG CCGCCGCCGC 120 

CTTCGCCTTC GCCTTTTGTT TCCTCCGCTC CGGCGCCCCC GCCCCGGCTC GCGCTTTGCA 180 

GGGGACGCAG CGCGCGCCCC CAGCGGGCCC GGGAAAAGCC GCGGCGCGCG CGCGCGCCTG 240 

CGCGGCGGAC CCCPCCTTCT CCTCCCCGCG TGCGCGTGCC CTTCTTGGCT GCGCGCCGGC 300 

GCCGCCTGGC GGGOGGGAGG GGAGGTGGCA GGCGCGTTTG CAGGAGGGGC GCACCTCTTC 360 

GCTCGCGCAC CCCCCCGGAA GGTAGACCGG GAAGGGGAGG CGGGCGGGCG GAGAGGAGAG 420 

AGTGGCGCGC AGTCCAGCGA GGGCGGGGGT TGGCTATGTG GGGGGTGGTG CACCCCGCAG 480 

TCTAGACAGT CTGATCCGGG CTGGGGGCGT GTACACTCGG CGCACCTGCG AGACTACAGA 540 

GCCTCGGGCC GGCACGTGTG GGGAGTGTGG ACACGTCTGC TGCGCCCCGC TTCTCGCTGC 600 
TGAGGGGAAG GGAGGGGGCG GGCAGGTGCA GCGGCCGGGC TAGTGGGAGG GGGCGGCGGC 660

CATGGAGCGG GTGAACGACG CTTCGTGCGG CCCGTCTGGC TGCTACACAT ACCAGGTGAG 720

CAGACACA6C ACGGAGATGC TGCACAACCT GAACCAGCAG CGCAAAAACG GCGGGCGCTT 780

CTGCGACGTG CTCTTGCGGG TAGGCGACGA GAGCTTCCCA GCGCACCGCG CCGTGCTGGC 840

CGCCTGCAGC GAGTACTTTG AGTCGGTGTT CAGCGCCCAG TTGGGCGACG GCGGAGCTGC 900

GGACGGGGGT CCGGCTGATG TAGGGGGCGC GACGGCAGCA CCAGGCGGCG GGGCCGGGGG 960

CAGCCGGGAG CTGGAGATGC ACACTATCAG CTCCAAGGTA TTTGGGGACA TTCTGGACTT 1020

CGCCTACACT TCCCGCATCG TGGTGCGCTT GGAGAGCTTT CCCGAACTCA TGACGGCCGC 1080

CAAGTTCCTG CTGATGAGGT CGGTTATCGA GATCTGCCAG GAAGTCATCA AACAGTGCAA 1140

CGTACAGATC CTGGTACCCC CTGCCCGCGC CGATATAATG CTCTTTCGCC CCCCTGGGAC 1200

CTCGGACTTG GGCTTCCCTT TGGACATGAC CAACGGGGCA GCCTTGGCAG CCAACAGCAA 1260

TGGCATCGCC GGCAGCATGC AGCCAGAGGA GGAGGCAGCT CGGGCGGCTG GTGCAGCCAT 1320

TGCAGGCCAA GCCTCTTTGC CTGTGTTACC TGGGGTGGAC CGCTTGCCCA TGGTGGCTGG 1380

ACCCCTATCC CCCCAACTGC TGACTTCCCC ATTCCCCAGT GTGGCATOCA GTGCCCCTCC 1440

CCTGACTGGC AAGCGAGGCC GGGGCCGCCC AAGGAAGGCC AACCTGCTGG ACTCAATGTT 1500

TGGGTCCCCA GGGGGCCTGA GGGAGGCAGG CATCCTTCCA TGCGGTCTAT GTGGTAAGGT 1560

GTTCACTGAT GCCAACCOGC TCCGGCAGCA CGAGGCCCAG CACGGTGTCA CCAGCCTCCA 1620

GCTGGGCTAC ATCGACCTTC CTCCTCCGAG GCTGGGTGAG AATGGGCTAC CCATCTCTGA 1680

AGACCCCGAC GGCCCCCGAA AGAGGAGCCG GACCAGGAAG CAGGTGGCTT GTGAGATCTG 1740

CGGCAAGATC TTCCGTGATG TGTATCATCT TAACCGGCAC AAGCTGTCCC ACTCTGGGGA 1800

GAAGCCCTAC TCCTGCCCTG TGTGTGGGTT GCGGTTCAAG AGAAAAGACC GCATGTCCTA 1860

CCATGTGCGG TCCCATGATG GGTCCGTGGG CAAGCCTTAC ATCTGCCAGA GCTGTGGGAA 1920

AGGCTTCTCC AGGCCTGATC ACTTGAACGG ACATATCAAG CAGGTGCACA CTTCTGAGCG 1980

GCCTCACAAG TGTCAGACCT GCAATGCTTC TTTTGCCACC CGAGACCGTC TGCGCTCCCA 2040

CCTGGCCTGT CATGAAGACA AGGTGCCCTG CCAGGTGTGT GGGAAGTACT TGCGGGCAGC 2100

[illegible] GACCACCTGA AGAAGCACAG CGAGGGGCCC AGCAACTTCT GCAGTATCTG 2160

TAACCGAGGT TTCTCCTCTG CCTCCTACTT AAAGGTCCAT GTTAAAACOC ACCACGGTGT 2220

TCCCCTTCCC CAGGTCTCCA GGCACCAGGA GCCCATCCTG AATGGGGGAG CAGCGTTCCA 2280

CTGCGCCAGG ACCTATGGCA ACAAAGAAGG CCAGAAATGC TCACATCAGG ATCCGATTGA 2340

GAGCTCTGAC TCCTATGGTG ACCTCTCAGA TGCCAGCGAC CTGAAGACGC CAGAGAAGCA 2400

GAGTGCCAAT GGCTCTTTCT CCTGCGACAT GGCAGTCCCC AAAAACAAAA TGGAGTCTGA 2460

TGGGGAGAAG AAGTACCCAT GCCCTGAATG TCGGAGCTTC TTCCGCTCTA AGTCCTACTT 2520

GAACAAACAC ATCCAGAAGG TGCATGTCCG GGCTCTCGGG GGCCCCCTGG GGGACCTGGG 2580

CCCTGCCCTT GGCTCACCTT TCTCTCCTCA GCAGAACATG TCTCTCCTCG AGTCCTTTGG 2640

GTTTCAGATT GTTCAGTCGG CATTTGCGTC ATCTTTAGTA GATCCTGAGG TTGACCAGCA 2700

GCCCATGGGG CCTGAAGGGA AATGAGGCAG CTGCTGTGTC CCCACGGAAA CAACCATCTG 2760

GGGACTGCTG GGAAATGCTG TGAATGCGGA GGGAAGTGAT GTTTGGGTTC TGTAGCTGAG 2820

AGATTTTTAT TCATTTTTAA CTGCCCCOCA ACCCCACTCC AACTCCTTCT CCACCACCCA 2880

TTCTCCCAAT GGTCTTTAGA AATAGATTTT CATCTGATAT TCTGCAGAAA TATCAATGAG 2940

ACTTGGTATG GGACAGGGGC AGAAAACACT ACATAGGCCT CCAAGGCAAA ACCAGTCOCA 3000

GTTTCTTTAA TGGGAAGAAG CTGGAATTCC TGGTGCTCAA TTCTTAGTGA CCCCAATCCT 3060

ATACCCAAAT CTATGATATT CTGGGACCTC AGTGATTTTG GTCCCCTCCG ACTTCTCTAG 3120

TTCGTCATCC TCCCTTCCCA TATCCTTCAA AAGAACCACA CTAGGGTCTC CACCTACTTA 3180

TACAATGCGG ATGCCCAACT GTTTTTAAGG AAGCCAGAAG CATCCCATGG [illegible] 3240

GAGTGTCCTC CAAGAGCCCC CTGAGCTCAG CCCTCTGCCT GGAGGGCTCC AGACCTTTCT 3300

GAGCCCTGCT TGGAGGCGAG CATTTTCACT GCTAGGACAA GCTCAGCTGT TGAGGACACC 3360

CCCACCCCAA ATTTCAGTTC TTACGTGATT TTAACCATTC AACATGCTGT TGGGTTTTAA 3420

TTCTCTAATT ATTATTATTA TTGTTATTAT TTTTTAGGAC CAGTTGTAGT GAATTGCTAC 3480

TGAAAGCTAT CCCAGGTGAT ACAGAGCTCT TTGTAAACCG CAGTCACACA TTAGGGTTAG 3540

TATTAAACTT TGTTTAGATG TACCATAATT AACTTGGCTA GTTGATTGTT TGAAGTCTAT 3600

GGAAGAAATA GTTTTATGCA AAATTTTAAA AAATGCCAGT CTGGTCAGGG AAGTAGGGGG 3660

TTTCAATGCT GTTGGGAAOC AGGAAGGTGG GACAGCCGGC AGGTAGGGAC ATTGTGTACC 3720

TCAGTTGTGT CACATGTGAG CAAGCCCAGG TTGACCTTGT GATGTGAATT GATCTGATCA 3780

GACTGTATTA AAAATGTTAG TACATTACTC TA
3780 



SEQ ID NO:265 PBY7 Protein sequence:
Protein Accession #: NP_114439

MERVNDASCG PSGCVTYQVS RHSTEMLHNL NQQRKNGGRF CDVLLRVGDE SFPAHRAVLA 60
ACSEYFESVF SAQLGDGGAA DGGPADVGGA TAAPGGGAGG SRELEMHTIS SKVFGDILDF 120
AYTSRIVVRL ESFPELMTAA KFLLMRSVIE ICQEVIKQSN VQELVPPARA DIMLPRPPGT 180
SDLGFPLDMT NGAALAANSN GIAGSMQPEE EAARAAGAAI AGQASLPVLP GVDRLPMVAG 240
PLSPQLLTSP FPSVASSAPP LTGKRGRGRP RKANLLDSMF GSPGGLREAG ILPCGLCGKV 300
FTDANRLRQH EAQHGVTSLQ LGYIDLPPPR LGENGLPISE DPDGPRKRSR TRKQVACEIC 360
GKIFRDVYHL NRHKLSHSGE KPYSCPVCGL RFKRKDRMSY HVRSHDGSVG KPYICQSCGK 420
GFSRPDHLNG HIKQVHTSER PHKCQTCNAS FATRDRLRSH LACHEDKVPC QVCGKYLRAA 480
YMADHLKKHS EGPSNFCSIC NREGQKCSHQ DPIESSDSYG DLSDASDLKT PEKQSANGSF 540
SCDMAVPKNK MESDGEKKYP CPECGSFFRS KSYLNKHIQK VHVRALGGPL GDLGPALGSP 600
FSPQQNMSLL ESFGPQIVQS AFASSLVDPE VDQQPMGPEG K



SEQ ID NO:266 PBY9 DNA sequence
Nucleic Acid Accession #: NM_012429

Coding sequence: 174-1385 (underlined sequence corresponds to start and stop codon)

1 11 21 31 41 51

I I I I I I

CCCTACTCCG CCTCTCGGGA TCCTTTAAGA GGCGGGGCTT GGCTGCCAGC TCCGCGGCCC 60 
GGGCAAAAGG CTGGGACTTT ACTCCGGGTG GCGGCGAGGA CGAGTCTGTG CTCCATCAGC 120 
TGCCGCACCC GCCGCCTCCC GCCCCCAAAC CCCATCCCCG CGGTTGAGCC ACGATGAGCG 180

GCAGAGTCGG CGATCTGAGC CCCAGGCAGA AGGAGGCATT GGCCAAGTTT CGGGAGAATG 240

TCCAGGATGT GCTGCCGGCC CTGCCGAATC CAGATGACTA TTTTCTCCTG CGTTGGCTCC 300 

GAGCCAGAAG CTTCGACCTG CAGAAGTCGG AGGCCATGCT CCGGAAGCAT GTGGAGTTCC 360 

GAAAGCAAAA GGACATTGAC AACATCATTA GCTGGCAGCC TCCAGAGGTG ATCCAACAGT 420 

ATCTGTCAGG GGGTATGTGT GGCTATGACC TGGATGGCTG CCCAGTCTGG TACGACATAA 480 

TTGGACCTCT GGATGCCAAG GGTCTGCTGT TCTCAGCCTC CAAACAGGAC CTGCTGAGGA 540 

CCAAGATGCG GGAGTGTGAG CTGCTTCTGC AAGAGTGTGC CCACCAGACC ACAAAGTTGG 600

GGAGGAAGGT GGAGACCATC ACCATAATTT ATGACTGCGA GGGGCTTGGC CTCAAGCATC 660 

TCTGGAAGCC TGCTGTGGAG GCCTATGGAG AGTTTCTCTG CATGTTTGAG GAAAATTATC 720 

CCGAAACACT GAAGCGTCTT TTTGTTGTTA AAGCCCCCAA ACTGTTTCCT GTGGCCTATA 780 

ACCTCATCAA ACCCTTCCTG AGTGAGGACA CTCGTAAGAA GATCATGGTC CTGGGAGCAA 840 

ATTGGAAGGA GGTTTTACTG AAACATATCA GCCCTGACCA GGTGCCTGTG GAGTATGGGG 900 

GCACCATGAC TGACCCTGAT GGAAACCCCA AGTGCAAATC CAAGATCAAC TACGGGGGTG 960 

ACATCCCCAG GAAGTATTAT GTGCGAGACC AGGTGAAACA GCAGTATGAA CACAGCGTGC 1020 

AGATTTCCCG TGGCTCCTCC CACCAAGTGG AGTATGAGAT CCTCTTCCCT GGCTGTGTCC 1080

TCAGGTGGCA GTTTATGTCA GATGGAGCGG ATGTTGGTTT TGGGATTTTC CTGAAGACCA 1140

AGATGGGAGA GAGGCAGCGG GCAGGGGAGA TGACAGAGGT GCTGCCCAAC CAGAGGTACA 1200

ACTCCCACCT GGTCCCTGAA GATGGGACCC TCACCTGCAG TGATCCTGGC ATCTATGTCC 1260

TGCGGTTTGA CAACACCTAC AGCTTCATTC ATGCCAAGAA GGTCAATTTC ACTGTGGAGG 1320

TCCTGCTTCC AGACAAAGCC TCAGAAGAGA AGATGAAACA GCTGGGGGCA GGCACCCCGA 1380 

AATAACACCT TCTCCTATAG CAGGCCTGGC CCCCTCAGTG TCTCCCTGTC AATTTCTACC 1440

CCTTGTAGCA GTCATTTTCG CACAACCCTG AAGCCCAAAG AAACTGGGCT GGAGGACAGA 1500 

CCTCAGGAGC TTTCATTTCA GTTAGGCAGA GGAAGAGCGA CTGCAGTGGG TCTCCGTGTC 1560 

TATCAAATAC CTAAGGAGTC CCCAGGAGCT GGCTGGCCAT CGTGATAGGA TCTGTCTGTC 1620 

CTGTAAACTG TGCCAACTTC ACCTGTCCAG GGACAGCGAA GCTGGGGGTG GCGGGGGGCA 1680 

TGTACCACAG GGTGGCAGCA GGGAAAAAAA TTAGAAAAGG GTGAAAGATT GGGACTTAAC 1740 

ACTTCAGGGA AGTCAGCTGC CGGGGAGAAA CTTGCTCCTA AATGAACACA TAAGTTTAGA 1800 

TCGCAATGAG GAGTAGCAGG GTAGCTGGTT GCTAGAGTTA CGGTGGGGAT CAGAAACTCT 1860 

TCCAAACATT TTAGCACTGA GGCTGGGGTA GCTTOTGGCT TTTCCCAGGT CTCAGGAGGT 1920 

GGCCTGAGTC AGCACACATC TTCCCACTCG GTAGACAGGC TGGCCTCTCC CTCACTTTGA 1980 

GACTTTGGCA ACTCCTGGGC CACACGGCCT GCCTCTTTGA TTACTAATGA TTGTCAGTGA 2040 

CTCAGAGCTT OCTGGGACTT CGGGTACCCA CCCGCTGTTC TCCATGCAAA CAAAGCGCCA 2100 

GGGAAATGAC CCACAGGGAT CGCAGCTGCA GGGAGGGCCA GGGAGGTTGG GGGTGGGAGT 2160 

GAATGCTAAA AGCAGATCGT CCAGTGCCCT TTTCAGTGCT ACCGGCCTCT CACCAAGCAG 2220 

TCCTCCATGT GAGCAACCCC GAGACAAAAA TGCTAAGTGG GATCAAGAGA GCAGCACTCG 2280 

GAGAGGGTGT TTGCCAGTCT GAGTGTCCCG CGGTGCCCGC CAACCCGCTT CCTGACTGAC 2340 

CTGAGCAAGG TCTTACTAAG CAGTCCCATC TCTGTGGGAG GCATGCAACG CGTGCAGGGA 2400 

GTTCAGGTGC CGGTCGGCGT AGCCAGGCCT GGAGGCCCCC CAGGCAGGAG GCCGCCCAAA 2460 

GGCGGGGCCG GCGTCTCGCA GACTAGGGGC TGGGGGCGGC CACAGACGGC CTCGAAACCA 2520 

CAGCCCTTAC CCCAATCCCA CGAGCCCCGC CAACGAACCA CAGGTGCTGG GCTTTAGAGA 2580 

ACATGGGAAG GCGGCCCCAG ACCTGGCGGG AACGCCTTTC CCTCAGAGCC AGGCCCCGGC 2640 

CCCGTCTGGG AAGCTCATCT TGCGAAGCTG AGGGAGCTCA GGGCAAAGGC CAGGCTAGCG 2700 

CGGACCGGAA GGGGCCGAGG CTGCACGGGC CTCTGCCAGA ACGCTCAGGA CATCCCGGCC 2760 
TGGGTTCACA ACGCTGTTAG GAAAATTAAC CAATGAATAA AGCAACGTTC AGTGCGCA 



SEQ ID NO:267 PBY9 Protein sequence:
Protein Accession #: NP_036561

MSGRVGDLSP RQKEALAKFR ENVQDVLPAL PNPDDYFLLR WLRARSFDLQ KSEAMLRKHV 60
EFRKQKDIDN IISWQPPEVI QQYLSGGMCG YDLDGCPVWY DIIGPLDAKG LLFSASKQDL 120
LRTKMRECEL LLQECAHQTT KLGRKVETIT IIYDCEGLGL KHLWKPAVEA YGEFLCMFEE 180
NYPETLKRLF VVKAPKLFPV AYNLIKPFLS EDTRKKIMVL GANWKEVLLK HISPDQVPVE 240
YGGTMTDPDG NPKCKSKINY GGDIPRKYYV RDQVKQQYEH SVQISRGSSH QVEYEILFPG 300
CVLRWQFMSD GADVGFGIFL KTKMGERQRA GEMTEVLPNQ RYNSHLVPED GTLTCSDPGI 360
YVLRFDNTYS FIHAKKVNFT VEVLLPDKAS EEKMKQLGAG TPK
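
Because the stated coding range includes both the start and the stop codon, it fixes the length of the encoded protein: the number of bases must be a multiple of three, and the stop codon contributes no residue. A small sketch of that arithmetic follows. The function name `expected_protein_length` is an illustrative assumption, and the PBY9 range is read as 174-1385.

```python
def expected_protein_length(cds_start: int, cds_end: int) -> int:
    """Residue count implied by a 1-based, inclusive coding range that
    contains both the start codon and the stop codon."""
    n_bases = cds_end - cds_start + 1
    if n_bases % 3 != 0:
        raise ValueError(f"coding range of {n_bases} bases is not a multiple of 3")
    return n_bases // 3 - 1  # the stop codon encodes no residue


# PBY9: coding sequence 174-1385 -> 1212 bases -> 404 codons -> 403 residues
print(expected_protein_length(174, 1385))
```

A mismatch between this count and the residue numbering of the protein listing suggests either a misread coordinate or a protein entry that corresponds to a different transcript variant.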



SEQ ID NO:268 PBH8 DNA sequence
Nucleic Acid Accession #: XM_009756

Coding sequence: 301-1440 (underlined sequence corresponds to start and stop codon)

1 11 21 31 41 51

I I I I I I

GTGGGGACAG CCGAGCCGCG CCGGGCCCCT GGACGGCGTC GCCAAGGAGC TGGGATCGCA 60 

CTTGCTGCAG ACTTTGGATG GATTTGTTTT TGTGGTAGCA TCTGATGGCA AAATCATGTA 120 

TATATCCGAG ACCGCTTCTG TCCATTTAGG CTTATCCCAG GTGGAGCTCA CGGGCAACAG 180 

TATTTATGAA TACATCCATC CTTCTGACCA CGATGAGATG ACCGCTGTCC TCACGGCCCA 240 

CCAGCCGCTG CACCACCACC TGCTCCAAGG TATGAGATAG AGAGGTCGTT CTTTCTTCGA 300 

ATGAAATGTG TCTTGGCGAA AAGGAACGCG GGCCTGACCT GCAGCGGATA CAAGGTCATC 360 

CACTGCAGTG GCTACTTGAA GATCAGGCAG TATATGCTGG ACATGTCCCT GTACGACTCC 420 

TGCTACCAGA TTGTGGGGCT GGTGGCCGTG GGCCAGTCGC TGCCACCCAG TGCCATCACC 480 

GAGATCAAGC TGTACAGTAA CATGTTCATG TTCAGGGCCA GCCTTGACCT GAAGCTGATA 540 
TTCCTGGATT CCAGGGTGAC CGAGGTGACG GGGTACGAGC CGCAGGACCT GATCGAGAAG 600

ACCCTATACC ATCACGTGCA CGGCTGCGAC GTGTTOCACC TCCGCTACGC ACACCACCTC 660 

CTGTTGGTGA AGGGCCAGGT CACCACCAAG TACTACCGGC TGCTGTCCAA GCGGGGCGGC 720 

TGGGTGTGGG TGCAGAGCTA CGCCACOGTG GTGCACAACA GCCGCTCGTC CCGGCCCCAC 780 

TGCATCGTGA GTGTCAATTA TGTACTCACG GAGATTGAAT ACAAGGAACT TCAGCTGTCC 840 

CTGGAGCAGG TGTCCACTGC CAAGTCCCAG GACTCCTGGA GGACCGCCTT GTCTACCTCA 900 

CAAGAAACTA GGAAATTAGT GAAACCCAAA AATACCAAGA TGAAGACAAA GCTGAGAACA 960

AACCCTTACC CCCCACAGCA ATACAGCTCG TTCCAAATGG ACAAACTGGA ATGCGGCCAG 1020

CTCGGAAACT GGAGAGCCAG TCCCCCTGCA AGCGCTGCTG CTCCTCCAGA ACTGCAGCCC 1080

CACTCAGAAA GCAGTGACCT TCTGTACACG CCATCCTACA GCCTGCCCTT CTCCTACCAT 1140

TACGGACACT TCCCTCTGGA CTCTCACGTC TTCAGCAGCA AAAAGCCAAT GTTGCCGGCC 1200

AAGTTCGGGC AGCCCCAAGG ATCCCCTTGT GAGGTGGCAC GCTTTTTCCT GAGCACACTG 1260

CCAGCCAGCG GTGAATGCCA GTGGCATTAT GCCAACCCCC TAGTGCCTAG CAGCTCGTCT 1320

CCAGCTAAAA ATCCTCCAGA GCCACCGGCG AACACTGCTA GGCACAGCCT GGTGCCAAGC 1380

TACGAAGGCA AGCAGATGTC CTCTGCGGAG ATACCGCCAG CTCCCCAGGA CGCAGACTGA 1440
CTCCTGTTTG CTCGCTGGAC CAAC 



SEQ ID NO:269 PBH8 Protein sequence

Protein Accession #: NP_005060



MKEKSKNAAK TRREKENGEF YELAKLLPLP SAITSQLDKA SIIRLTTSYL KMRAVFPEGL 60
GDAWGQPSRA GPLDGVAKEL GSHLLQTLDG FVFVVASDGK IMYISETASV HLGLSQVELT 120
GNSIYEYIHP SDHDEMTAVL TAHQPLHHHL LQEYEIERSF FLRMKCVLAK RNAGLTCSGY 180
KVIHCSGYLK IRQYMLDMSL YDSCYQIVGL VAVGQSLPPS AITEIKLYSN MFMFRASLDL 240
KLIFLDSRVT EVTGYEPQDL IEKTLYHHVH GCDVFHLRYA HHLLLVKGQV TTKYYRLLSK 300
RGGWVWVQSY ATVVHNSRSS RPHCIVSVNY VLTEIEYKEL QLSLEQVSTA KSQDSWRTAL 360
STSQETRKLV KPKNTKMKTK LRTNPYPPQQ YSSFQMDKLE CGQLGNWRAS PPASAAAPPE 420
LQPHSESSDL LYTPSYSLPF SYHYGHFPLD SHVFSSKKPM LPAKFGQPQG SPCEVARFFL 480
STLPASGECQ WHYANPLVPS SSSPAKNPPE PPANTARHSL VPSYEAPAAA VRRPGEDTAP 540
PSFPSCGHYR EEPALGPAKA ARQAARDGAR LALARAAPEC CAPPTPEAPG APAQLPFVLL 600
NYHRVLARRG PLGGAAPAAS GIACAPGGPE AATGALRLRH PSPAATSPPG APLPHYLGAS 660
VIITNGR

SEQ ID NO:270 Pftf9 DNA sequence

Nucleic Acid Accession #: AA760894

GGCACGAGGA GAAGATGTGG CTTGCTCATO CTTGACTTCT GCCATGGTTG TGAGGCCTCC 60
CCAGCCATGT GGAACTGTTT TCAGGTGCTG GTTCCATGGC TCTTCCTGAG CCGAAAATAA 120
GGAAACTCCA TAGACCTTGT CCACTGGAAC TCGTTCCCAT CTACCCTCCA CTCTATCCAG 180
GGTGATGGAT CTCTGCAGTA AGTGGAAGAG TTCTTCATGG CCCCCAAGGT TATATOCATC 240
TAGAACTTCA GCACGTAATT TCATCTGGAA ATAGTGCCTT TGTGGATATA AGTTAGGTAA 300
AACTGAAGAT GAGATCATAC TGGATTAGGA TGGGATCTAA ATCCAATGAA AATGTCTTCA 360
TAAAAAACAG GAAAGAACCC ATAGAAACAC AAGGAAGAAG GTCATGTGAA GATGGAGGCA 420
GAGATTGGAG GGATGCAGCC ACCGGCCCAG GAATGCCAGC AGCCACCCAG AAGCTGGAAG 480
GAAATGAGGG ATTCTCTOCT AGAACCTTTA GAGAGRACAT GGTCCTGTGA ACAGCTTGAT 540
TTTGGACTTG CCCATAGCTT GTATACTCTT ACTTTGGATA CAATTTTATC CAAACTTGGC 600
TAAACAGTTT CTCAGCCTAT GGAAAATTTA AAATGGAGAA GATTCAACTC GATTCTTACA 660
GATTCAAAGC AAGAAAATGA TGGGAACATA GGAGGAGACC AAGAAAGCCT ATAAAAAGCA 720
AAAATATGAA GTGAACATTG TGGTAGCTTT AAGATGTTTA GTGTAGCTGC AGGCACCCTA 780
TACACATGAA AAOCCCCAAG GGGAATCCCC ATATCACAGT GTAGTGTGAT ATTTGACATT 840
YGTGATCATY TAGAGATGTA CAGAAAAGGT GAATCTGTGT TCTGTATATT CTGCCTAAGG 900
CAAAGAAATG TTTAGCTYTC TTTAAAATAG TTCCATAATT TTTTYTAAAA AGCTTTGCTT 960
GAAAACTGTA AGCTTOCCAT ATCTGGAGCA TTTCACTTTA AATATTTGGA TAAATATGTT 1020
ATCTTCTTAC TTGGACATTT CATGTGTTTA GGGATTGTYT TYTAAATTCT TCCTAATTCA 1080
TATAGCTGCT AACACTTCCC GCAGAGCTAA ACCATTACAG ANTATGAAAT AAAGACCCTA 1140
TTGATTTGAA CTTAAAAAAA AAAAMAMAAA AAAAAAAAAA AAAAAAAAAT GA 

Nucleic Acid Accession #: AA149579

Coding sequence: 1-1363 (underlined sequence corresponds to start and stop codon)

1 11 21 31 41 51

ATGGAATCAA TCTCTATGAT GGGAAGCCCT AAGAGCCTTA GTGAAACTTG TTTACCTAAT 60

GGCATAAATG GTATCAAAGA TGCAAGGAAG GTCACTGTAG GTGTGATTGG AAGTGGAGAT 120 

TTTGCCAAAT CCTTGACCAT TCGACTTATT AGATGCGGCT ATCATGTGGT CATAGGAAGT 180 

AGAAATCCTA AGTTTGCTTC TGAATTTTTT CCTCATGTGG TAGATGTCAC TCATCATGAA 240 

GATGCTCTCA CAAAAACAAA TATAATATTT GTTGCTATAC ACAGAGAACA TTATACCTCC 300

CTCTGGGACC TGAGACATCT GCTTGTGGGT AAAATCCTGA TTGATGTGAG CAATAACATG 360

AGGATAAACC AGTACCCAGA ATCCAATGCT GAATATTTGG CTTCATTATT CCCAGATTCT 420 

TTGATTGTCA AAGGATTTAA TGTTGTCTCA GCTTGGGCAC TTCAGTTAGG ACCTAAGGAT 480 

GCCAGCCGGC AGGTTTATAT ATGCAGCAAC AATATTCAAG CGCGACAACA GGTTATTGAA 540 

CTTGCCCGCC AGTTGAATTT CATTCCCATT GACTTGGGAT CCTTATCATC AGCCAGAGAG 600

ATTGAAAATT TACCCCTACG ACTCTTTACT CTCTGGAGAG GGCCAGTGGT GGTAGCTATA 660

AGCTTGGCCA CATTTTTTTT CCTTTATTCC TTTGTCAGAG ATGTGATTCA TCCATATGCT 720 

AGAAACCAAC AGAGTGACTT TTACAAAATT CCTATAGAGA TTGTGAATAA AACCTTACCT 780 

ATAGTTGCCA TTACTTTGCT CTCCCTAGTA TACCTCGCAG GTCTTCTGGC AGCTGCTTAT 840 

CAACTTTATT ACGGCACCAA GTATAGGAGA TTTCCACCTT GGTTGGAAAC CTGGTTACAG 900

TGTAGAAAAC AGCTTGGATT ACTAAGTTTT TTCTTCGCTA TGGTCCATGT TGCCTACAGC 960

CTCTGCTTAC CGATGAGAAG GTCAGAGAGA TATTTGTTTC TCAACATGGC TTATCAGCAG 1020 

GTTCATGCAA ATATTGAAAA CTCTTGGAAT GAGGAAGAAG TTTGGAGAAT TGAAATGTAT 1080 

ATCTCCTTTG GCATAATGAG CCTTGGCTTA CTTTCCCTCC TGGCAGTCAC TTCTATCCCT 1140 

TCAGTGAGCA ATGCTTTAAA CTGGAGAGAA TTCAGTTTTA TTCAGTCTAC ACTTGGATAT 1200 
GTCGCTCTGC TCATAAGTAC TTTCCATGTT TTAATTTATG GATGGAAACG AGCTTTTGAG 1260
GAAGAGTACT ACAGATTTTA TACACCACCA AACTTTGTTC TTGCTCTTGT TTTGCCCTCA 1320
ATTGTAATTC TGGATCTTTT GCAGCTTTGC AGATACCCAG ACTGA

SEQ ID NO:272 PPQ4 Protein sequence

Protein Accession #: none



1 11 21 31 41 51 

I I I I I I 

MESISMMGSP KSLSETCLPN GINGIKDARK VTVGVIGSGD FAKSLTIRLI RCGYHVVIGS 60
RNPKFASEFF PHVVDVTHHE DALTKTNIIF VAIHREHYTS LWDLRHLLVG KILIDVSNNM 120
RINQYPESNA EYLASLFPDS LIVKGFNVVS AWALQLGPKD ASRQVYICSN NIQARQQVIE 180
LARQLNFIPI DLGSLSSARE IENLPLRLFT LWRGPVVVAI SLATFFFLYS FVRDVIHPYA 240
RNQQSDFYKI PIEIVNKTLP IVAITLLSLV YLAGLLAAAY QLYYGTKYRR FPPWLETWLQ 300
CRKQLGLLSF FFAMVHVAYS LCLPMRRSER YLFLNMAYQQ VHANIENSWN EEEVWRIEMY 360
ISFGIMSLGL LSLLAVTSIP SVSNALNWRE FSFIQSTLGY VALLISTFHV LIYGWKRAFE 420
EEYYRFYTPP NFVLALVLPS IVILDLLQLC RYPD
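
The relationship between a "Coding sequence" coordinate range and the corresponding protein listing can be illustrated with a short Python sketch. This is not part of the original disclosure. It assumes the standard genetic code, 1-based inclusive coordinates that end on the stop codon, and a DNA listing whose only non-sequence characters are whitespace and running position counters. The function name translate_cds is hypothetical and chosen for illustration only.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
# Assumption: no alternative translation table applies to these sequences.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_cds(listing: str, start: int, end: int) -> str:
    """Translate positions start..end (1-based, inclusive) of a DNA listing."""
    # Drop spaces, line breaks and the running position counters.
    seq = "".join(ch for ch in listing.upper() if ch.isalpha())
    cds = seq[start - 1:end]
    protein = []
    for i in range(0, len(cds) - 2, 3):
        # "X" marks a codon containing a character other than A, C, G or T.
        aa = CODON_TABLE.get(cds[i:i + 3], "X")
        if aa == "*":  # a stop codon ends the protein
            break
        protein.append(aa)
    return "".join(protein)


# First 30 bases of the PPQ4 DNA listing, whose coding sequence starts at position 1.
ppq4_start = "ATGGAATCAA TCTCTATGAT GGGAAGCCCT 60"
print(translate_cds(ppq4_start, 1, 30))
```

Run on a complete DNA listing with its stated coding range, the same function gives a translation that can be compared residue by residue with the protein listing. Any codon reported as X points to a base that is ambiguous or unreadable in the listing.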

SEQ ID NO:273 PBQ5 DNA SEQUENCE

Nucleic Acid Accession #: NM_001973

Coding sequence: 150-1445 (underlined sequence corresponds to start and stop codon)



1 11 21 31 41 51 

I I I I I I 

CCGCCGCCTT CTACTCCGCC GCGGGGGTCG CAGCGGCTGC CGCGCCGTCC TCGAGTTTCC 60 

AGCGTGAGGA GGAGGCTGAG GGCGGAGAGG CGCATCGTGT TCGAGGCGGA GACCGAGGGG 120 

GAGCCCCGOG CGCGGCGTOG CTCATTGCTA TGGACAGTGC TATCACCCTG TGGCAGTTCC 180

TTCTTCAGCT CCTGCAGAAG CCTCAGAACA AGCACATGAT CTGTTGGACC TCTAATGATG 240 

GGCAGTTTAA GCTTTTGCAG GCAGAAGAGG TGGCTCGTCT CTGGGGGATT CGCAAGAACA 300 

AGCCTAACAT GAATTATGAC AAACTCAGCC GAGCCCTCAG ATACTATTAT GTAAAGAATA 360 

TCATCAAAAA AGTGAATGGT CAGAAGTTTG TGTACAAGTT TGTCTCTTAT CCAGAGATTT 420 

TGAACATGGA TCCAATGACA GTGGGCAGGA TTGAGGGTGA CTGTGAAAGT TTAAACTTCA 480 

GTGAAGTCAG CAGCAGTTCC AAAGATGTGG AGAATGGAGG GAAAGATAAA GCACCTCAGC 540 

CTGGTGCCAA GACCTCTAGC CGCAATGACT ACATACACTC TGGCTTATAT TCTTCATTTA 600 

CTCTCAACTC TTTGAACTCC TCCAATGTAA AGCTTTTCAA ATTGATAAAG ACTGAGAATC 660 

CAGCCGAGAA ACTGGCAGAG AAAAAATCTC CTCAGGAGCC CACACCATCT GTCATCAAAT 720 

TTGTCACGAC ACCTTCCAAA AAGCCACCAG TTGAACCTGT TGCTGCCACC ATTTCAATTG 780 

GCCCAAGTAT TTCTCCATCT TCAGAAGAAA CTATCCAAGC TTTGGAGACA TTGGTTTCCC 840

CAAAACTGCC TTCCCTGGAA GCCCCAACCT CTGCCTCTAA CGTAATGACT GCTTTTGCCA 900

CCACACCACC CATTTCGTCC ATACCCCCTT TGCAGGAACC TCCCAGAACA CCTTCACCAC 960 

CACTGAGTTC TCACCCAGAC ATCGACACAG ACATTGATTC AGTGGCTTCT CAGCCAATGG 1020 

AACTTCCAGA GAATTTGTCT CTGGAGCCTA AAGACCAGGA TTCAGTCTTG CTAGAAAAGG 1080 

ACAAAGTAAA TAATTCATCA AGATCCAAGA AACCCAAAGG GTTAGGACTG GCACCCACCC 1140 

TTGTGATCAC GAGCAGTGAT CCAAGCCCAC TGGGAATACT GAGCCCATCT CTCCCTACAG 1200 

CTTCTCTTAC ACCAGCATTT TTTTCACAGA CACCCATCAT ACTGACTCCA AGCCCCTTGC 1260 

TCTCCAGTAT CCACTTCTGG AGTACTCTCA GTCCTGTTGC TCCCCTAAGT CCAGCCAGAC 1320 

TGCAAGGTGC TAACACACTT TTCCAGTTTC CTTCTGTACT GAACAGTCAT GGGCCATTCA 1380

CTCTGTCTGG GCTGGATGGA CCTTCCACCC CTGGCCCATT TTCCCCAGAC CTACAGAAGA 1440 

CATAACCTAT GCACTTGTGG AATGAGAGAA CCGAGGAACG AAGAAACAGA CATTCAACAT 1500 

GATTGCATTT GAAGTGAGCA ATTGATAGTT CTACAATGCT GATAATAGAC TATTGTGATT 1560 

TTTGCCATTC CCCATTGAAA ACATCTTTTT AGGATTCTCT TTGAATAGGA CTCAAGTTGG 1620 

ACTATATGTA TAAAAATGCC TTAATTGGAG TCTAAACTCC AOCTCCCTCT GTCTTTTCCT 1680 

TTTCTTTTTC TTTOCTTCCT TCCTTTTCTT TTCTCCTTTA AAAATATTTT GAGCTTTGTG 1740 

CTGAAGAAGT TTTTGGTGGG CTTTAGTGAC TGTGCTTTGC AAAAGCAATT AAGAACAAAG 1800 

TTACTCCTTC TGGCTATTGG GAOCCTTTGG CCAGGAAAAA TTATGCTTAG AATCTATTAT 1860 

TTAAAGAAGT ATTTGTGAAA TGAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 1920 
AAAAAAAAAA AAA 



SEQ ID NO:274 PBQ5 Protein sequence
Protein Accession #: NP_001964

MDSAITLWQF LLQLLQKPQN KHMICWTSND GQFKLLQAEE VARLWGIRKN KPNMNYDKLS 60
RALRYYYVKN IIKKVNGQKF VYKFVSYPEI LNMDPMTVGR IEGDCESLNF SEVSSSSKDV 120
ENGGKDKPPQ PGAKTSSRND YIHSGLYSSF TLNSLNSSNV KLFKLIKTEN PAEKLAEKKS 180
PQEPTPSVIK FVTTPSKKPP VEPVAATISI GPSISPSSEE TIQALETLVS PKLPSLEAPT 240
SASNVMTAFA TTPPISSIPP LQEPPRTPSP PLSSHPDIDT DIDSVASQPM ELPENLSLEP 300 
KDQDSVLLEK DKVNNSSRSK KPKGLGLAPT LVITSSDPSP LGILSPSLPT ASLTPAFFSQ 360 
TPIILTPSPL LSSIHFWSTL SPVAPLSPAR LQGANTLFQF PSVLNSHGPF TLSGLDGPST 420 
PGPFSPDLQK T 



SEQ ID NO:275 PBY3 DNA SEQUENCE

Nucleic Acid Accession #: AB040321

Coding sequence: 131-2560 (underlined sequence corresponds to start and stop codon)

1 11 21 31 41 51 

I I I I I I 

AATCAGGAAC AGATCATATA TTGACCGAGA TTCTGAGTAT CTCTTGCAAG AAAATGAACC 60

AGATGGAACT TTAGACCAAA AATTATTGGA AGATTTACAA AAGAAAAAAA ATGACCTTCG 120 

GTATATTGAA ATGCAGCATT TCAGAGAAAA GCTGCCTTCG TATGGAATGC AAAAGGAATT 180 

GGTAAATTTA ATTGATAACC ATCAGGTAAC AGTAATAAGT GGTGAAACTG GTTGTGGCAA 240

AACCACTCAA GTTACTCAGT TCATTTTGGA TAACTACATT GAAAGAGGAA AAGGATCTGC 300 

TTGCAGAATA GTTTGTACTC AGCCAAGAAG AATTAGTGCC ATTTCAGTTG CGGAAAGAGT 360

AGCTGCAGAA AGGGCAGAAT CTTGTGGCAG TGGTAATAGT ACTGGATATC AAATTCGTCT 420 

CCAGAGTCGG TTGCCAAGGA AACAGGGTTC TATCTTATAC TGTACAACAG GAATCATCCT 480 

TCAGTGGCTC CAGTCAGACC CGTATTTGTC CAGTGTTAGT CATATCGTAC TTGATGAAAT 540 

CCATGAAAGA AATCTGCAGT CAGATGTTTT AATGACTGTT GTTAAAGACC TTCTCAATTT 600 

TCGATCTGAC TTGAAAGTAA TATTGATGAG TGCAACATTG AATGCAGAAA AGTTTTCAGA 660 

ATATTTTGGT AACTGTCCAA TGATACATAT ACCTGGTTTT ACCTTTCCGG TTGTGGAATA 720 

TCTTTTGGAA GATGTAATTG AAAAAATAAG GTATGTTCCA GAACAAAAAG AACACAGATC 780 

CCAGTTTAAG AGGGGTTTCA TGCAAGGGCA TGTAAATAGA CAAGAAAAAG AAGAAAAAGA 840 

AGCAATATAT AAAGAACGTT GGCCAGATTA TGTAAGGGAA CTGCGAAGAA GGTATTCTGC 900 

AAGTACTGTA GATGTTATAG AAATGATGGA GGATGATAAA GTTGATCTGA ATTTGATTGT 960 

TGCCCTCATC CGATACATTG TTTTGGAAGA AGAGGATGGT GCGATACTGG TCTTTCTGCC 1020 

AGGCTGGGAC AATATCAGCA CTTTACATGA TCTCTTGATG TCACAAGTAA TGTTTAAATC 1080 

AGATAAATTT TTAATTATAC CTTTACATTC ACTOATGCCT ACAGTTAACC AGACACAGGT 1140 

GTTTAAAAGA ACCCCTCCTG GTGTTCGGAA AATAGTAATT GCTACCAACA TTGCGGAGAC 1200 

TAGCATTACC ATAGATGATG TCGTTTATGT GATAGATGGA GGAAAAATAA AAGAGACGCA 1260 

TTTTGATACT CAGAACAATA TCAGTACAAT GTCCGCTGAG TGGGTTAGTA AAGCTAATGC 1320 

CAAACAGAGA AAAGGTCGAG CTGGAAGAGT TCAACCTGGT CATTGCTATC ATCTGTATAA 1380 

TGGTCTTAGA GCAAGTCTTC TAGATGACTA TCAACTGCCA GAAATTTTGA GAACTCCTTT 1440 

GGAAGAACTT TGTTTACAAA TAAAGATTTT AAGGCTAGGT GGAATTGCTT ATTTTCTGAG 1500 

TAGATTAATG GACCCACCAT CAAATGAGGC AGTGTTACTC TCCATAAGAC ACCTGATGGA 1560 

GCTGAACGCT TTGGATAAAC AAGAAGAATT GACACCTCTT GGAGTCCACT TGGCACGATT 1620 

ACCCGTTGAG CCACATATTG GAAAAATGAT TCTTTTTGGA GCACTGTTCT GCTGCTTAGA 1680 

CCCAGTACTC ACTATTGCTG CTAGTCTCAG TTTCAAAGAT CCATTTGTCA TTCCACTGGG 1740 

AAAAGAAAAG ATTGCAGATG CAAGAAGAAA GGAATTGGCA AAGGATACTA GAAGTGATCA 1800 

CTTAACAGTT GTGAATGCGT TTGAGGGCTG GGAAGAGGCT AGGCGACGTG GTTTCAGATA 1860 

CGAAAAGGAC TATTGCTGGG AATATTTTCT GTCTTCAAAC ACACTGCAGA TGCTGCATAA 1920 

CATGAAAGGA CAGTTTGCTG AGCATCTTCT TGGAGCTGGA TTTGTAAGCA GTAGAAATCC 1980 

TAAAGATCCA GAATCTAATA TAAATTCAGA TAATGAGAAG ATAATTAAAG CTGTCATCTG 2040 

TGCTGGTTTA TATCCCAAAG TTGCTAAAAT TCGACTAAAT TTGGGTAAAA AAAGAAAAAT 2100 

GGTAAAAGTT TACACAAAAA CCGATGGCCT GGTTGCTGTT CATCCTAAAT CTGTTAATGT 2160

GGAGCAAACA GACTTTCACT ACAACTGGCT TATCTATCAC CTAAAGATGA GAACAAGCAG 2220 

TATATACTTG TATGACTGCA CAGAGGTTTC CCCATACTGT CTCTTGTTTT TTGGAGGTGA 2280 

CATTTCCATC CAGAAGGATA ACGATCAGGA AACTATTGCT GTAGATGAGT GGATTGTATT 2340 

TCAGTCTCCA GCAAGAATTG CCCATCTTGT TAAGGAATTA AGAAAGGAAC TAGATATTCT 2400 

TCTGCAAGAG AAGATTGAAA GTCCTCATCC TGTAGACTGG AATGACACTA AATCCAGAGA 2460

CTGTGCAGTA CTGTCAGCTA TTATAGACTT GATCAAAACA CAGGAAAAGG CAACTCCCAG 2520 

GAACTTTCCG CCACGATTCC AGGATGGATA TTACAGCTGA CAGCTTTTCA GGGGTGGTCT 2580 

GAAAAGCCAG TTTGACAGCC ATTCTTCATC ATTGTTTAAA TTTTGGCTGG ATGCCAAACC 2640 

CTGGGACATG AACAATTTTC ATGTGTAAGG TAGAAGCCTT CAGTAGGTAG TAAAGACTTA 2700 

ATGTGCATGA CTTGATGTTA TATGTAGAGA TATATATATA TATATATATA CCAIAAAAGC 2760 

AATATGTTCT CTGATCATAT ACTCTGCTGT GGTCATGCCC ACTCTTTGGG AGTATATTCC 2820 

CTTTATATAT ATTGAGTATT GTACCACTTG AGAAATTCCT TTGTTCTGTT ATACAAAATT 2880 

AATCTTTCTG CTCATAATGA TTGATGATAC CACCAGTAAA AATAGGATGT TTACCCCAAA 2940 

ACAAGTGTCA ATTAAGAATT TGAACACAAC CACATTTTTT AAAATGAAAC TTCTATCGGA 3000 
AGTAAATTAA TTTGTTGTAA TAAAGTCCAG TATTTAATAA AATGTACAAT GTTAAATCTC 

SEQ ID NO:276 PBY3 Protein sequence: 
Protein Accession #: BAA96012

IRNRSYIDRD SEYLLQENEP DGTLDQKLLE DLQKKKNDLR YIEMQHFREK LPSYGMQKEL 60 
VNLIDNHQVT VISGETGCGK TTQVTQFILD NYIERGKGSA CRIVCTQPRR ISAISVAERV 120
AAERAESCGS GNSTGYQIRL QSRLPRKQGS ILYCTTGIIL QWLQSDPYLS SVSHIVLDEI 180
HERNLQSDVL MTVVKDLLNF RSDLKVILMS ATLNAEKFSE YFGNCPMIHI PGFTFPVVEY 240
LLEDVIEKIR YVPEQKEHRS QFKRGFMQGH VNRQEKEEKE AIYKERWPDY VRELRRRYSA 300
STVDVIEMME DDKVDLNLIV ALIRYIVLEE EDGAILVFLP GWDNISTLHD LLMSQVMFKS 360
DKFLIIPLHS LMPTVNQTQV FKRTPPGVRK IVIATNIAET SITIDDVVYV IDGGKIKETH 420
FDTQNNISTM SAEWVSKANA KQRKGRAGRV QPGHCYHLYN GLRASLLDDY QLPEILRTPL 480
EELCLQIKIL RLGGIAYFLS RLMDPPSNEA VLLSIRHLME LNALDKQEEL TPLGVHLARL 540
PVEPHIGKMI LFGALFCCLD PVLTIAASLS FKDPFVIPLG KEKIADARRK ELAKDTRSDH 600
LTVVNAFEGW EEARRRGFRY EKDYCWEYFL SSNTLQMLHN MKGQFAEHLL GAGFVSSRNP 660
KDPESNINSD NEKIIKAVIC AGLYPKVAKI RLNLGKKRKM VKVYTKTDGL VAVHPKSVNV 720
EQTDFHYNWL IYHLKMRTSS IYLYDCTEVS PYCLLFFGGD ISIQKDNDQE TIAVDEWIVF 780
QSPARIAHLV KELRKELDIL LQEKIESPHP VDWNDTKSRD CAVLSAIIDL IKTQEKATPR 840
NFPPRFQDGY YS 



Nucleic Acid Accession #:
Coding sequence:

GATTTTATCC TGGAACATTA CAGTGAAGAT GGCTATTTAT ATGAAGATGA AATTGCAGAT 60
CTTATGGATC TGAGACAAGC TTGTCGGACG CCTAGCCGGG ATGAGGCCGG GGTGGAACTG 120
CTGATGACAT ACTTCATCCA GCTCGGCTTT GTCGAGAGTC GATTCTTCCC GCCCACACXK5 180
CAGATGGGAC TCCTQTTCAC CTGGTATGAC TCTCTCACCG GGGTTCCGGT CAGCCAGCAG 240
AACCTCCTGC TGGAGAAGGC CAGTGTCCTO TTCAACACTG GGGCCCTCTA CACCCAGATT 300
GGGACCCGGT GTGATCGGCA GACGCAGGCT GGGCTGGAGA GTGCCATAGA TGCCTTTCAG 360
AGAGCCGCAG GGGTTTTAAA TTACCTGAAA GACACATTTA COCATACTCC AAGTTACGAC 420
ATGAGCCCTG CCATGCTCAG CGTGCTCGTC AAAATGATGC TTGCACAAGC CCAAGAAAGC 480
GTGTTTGAGA AAATCAGCCT TCCTGGGATC CGGAATGAAT TCTTCATGCT GGTGAAGGTG 540
GCTCAGGAGG CTGCTAAGGT GGGAGAGGTC TACCAACAGC TACACGCAGC CATGAGCCAG 600
GCGOCGGTGA AAGAGAACAT COGCTACTOC TGGGCCAGCT TAGCCTGCGT GAAGGCOCAC 660
CACTACGCGG CCCTGGCCCA CTACTTCACT GCCATCCTCC TCATCGACCA CCAGGTGAAG 720
CCAGGCACGG ATCTGGACCA CCAGGAGAAG TGCCTGTCCC AGCTCTACGA CCACATGCCA 780
GAGGGGCTGA CACCCTTGGC CACACTGAAG AATGATCAGC AGCGCCGACA GCTGGGGAAG 840
TCCCACTTGC GCAGAGCCAT GGCTCATCAC GAGGAGTCGG TGCGGGAGGC CAGCCTCTGC 900
AAGAAGCTGC GGAGCATTGA GGTGCTACAG AAGGTGCTGT GTGCCGCACA GGAACGCTCC 960
CGGCTCACGT ACGCCCAGCA CCAGGAGGAG GATGACCTGC TGAACCTGAT CGACGCCCCC 1020
AGTGTTGTTG CTAAAACTGA GCAAGAGGTT GACATTATAT TGCCCCAGTT CTCCAAGCTG 1080
ACAGTCACGG ACTTCTTCCA GAAGCTGGGC CCCTTATCTG TGTTTTCGGC TAACAAGCGG 1140
TGGACGCCTC CTCGAAGCAT CCGCTTCACT GCAGAAGAAG GGGACTTGGG GTTCACCTTG 1200
AGAGGGAACG CCCCCGTTCA GGTTCACTTC CTGGATCCTT ACTGCTCTGC CTCGGTGGCA 1260
GGAGCCCGGG AAGGAGATTA TATTGTCTCC ATTCAGCTTG TGGATTGTAA GTGGCTGACG 1320
CTGAGTGAGG TTATGAAGCT GCTGAAGAGC TTTGGCGAGG ACGAGATCGA GATGAAAGTC 1380
GTGAGCCTCC TGGACTCCAC ATCATCCATG CATAATAAGA GTGCCACATA CTCCGTGGGA 1440
ATGCAGAAAA CGTACTCCAT GATCTGCTTA GCCATTGATG ATGACGACAA AACTGATAAA 1500
ACCAAGAAAA TCTCCAAGAA GCTTTCCTTC CTGAGTTGGG GCACCAACAA GAACAGACAG 1560
AAGTCAGCCA GCACCTTGTG CCTCCCATCG GTCGGGGCTG CACGGCCTCA GGTCAAGAAG 1620
AAGCTGCCCT CCCCTTTCAG CCTTCTCAAC TCAGACAGTT CTTGGTACTA A



SEQ ID NO:278 ppyg Protein sequence
Protein Accession #: NP_149094

DFILEHYSED GYLYEDEIAD LMDLRQACRT PSRDEAGVEL LMTYFIQLGF VESRFFPPTR 60
QMGLLFTWYD SLTGVPVSQQ NLLLEKASVL FNTGALYTQI GTRCDRQTQA GLESAIDAFQ 120
RAAGVLNYLK DTFTHTPSYD MSPAMLSVLV KMMLAQAQES VFEKISLPGI RNEFFMLVKV 180
AQEAAKVGEV YQQLHAAMSQ APVKENIPYS WASLACVKAH HYAALAHYFT AILLIDHQVK 240
PGTDLDHQEK CLSQLYDHMP EGLTPLATLK NDQQRRQLGK SHLRRAMAHH EESVREASLC 300
KKLRSIEVLQ KVLCAAQERS RLTYAQHQEE DDLLNLIDAP SVVAKTEQEV DIILPQFSKL 360
TVTDFFQKLG PLSVFSANKR WTPPRSIRFT AEEGDLGFTL RGNAPVQVHF LDPYCSASVA 420
GAREGDYIVS IQLVDCKWLT LSEVMKLLKS FGEDEIEMKV VSLLDSTSSM HNKSATYSVG 480
MQKTYSMICL AIDDDDKTDK TKKISKKLSF LSWGTNKNRQ KSASTLCLPS VGAARPQVKK 540
KLPSPFSLLN SDSSWY



SEQ ID NO:279 PBY8 DNA SEQUENCE

Nucleic Acid Accession #: AF107493

Coding sequence: 125-556 (underlined sequence corresponds to start and stop codon)

1 11 21 31 41 51

I I I I I I

GAATTCGGCA CGAGCCTTGT TGGAGGTTCT GGGGCGCAGA ACCGCTACTG CTGCTTCGGT 60

CTCTCCTTGG GAAAAAAIAA AATTTGAACC TTTTGGAGCT GTGTGCTAAA TCTTCAGTGG 120 

GACAATGGGT TCAGACAAAA GAGTGAGTAG AACAGAGCGT AGTGGAAGAT ACGGTTCCAT 180 

CATAGACAGG GATGACCGTG ATGAGCGTGA ATCCCGAAGC AGGCGGAGGG ACTCAGATTA 240

CAAAAGATCT AGTGATGATC GGAGGGGTGA TAGATATGAT GACTACCGAG ACTATGACAG 300 

TCCAGAGAGA GAGCGTGAAA GAAGGAACAG TGACCGATCC GAAGATGGCT ACCATTCAGA 360 

TGGTGACTAT GGTGAGCACG ACTATAGGCA TGACATCAGT GACGAGAGGG AGAGCAAGAC 420 

CATCATGCTG CGCGGCCTTC CCATCACCAT CACAGAGAGC GATATTCGAG AAATGATGGA 480 

GTCCTTCGAA GGCCCTCAGC CTGCGGATGT GAGGCTGATG AAGAGGAAAA CAGGTGAGAG 540 

CTTGCTTAGT TCCTGATATT ATTGTTCTCT TCCCCATTCC CACCTCAGTC CCTAAAGAAC 600 

ATCCTGATTC CCCCAGTCTT CAAGCACATG AATTCAGAAT GAAAGGTTTG CCATGGCTAA 660 

GGAATGTGAC TCTTTGAAAA CCATGTTAGC ATCTGAGGAA CTTTTTTAAA CTTTOTTW ^ 720 

GGGACTTTTT TTTCCTTAGG TAAGTAATGA TTTATAAACT CCTTTTTTTT TTTGACTATA 780 

GTCGGTTGCA TGGTTACTTT AAGCGTGGAA TCAAATGGAG TGGCATTTAG TTCAGGCGGC 840 

TTGTTCCTTG CCATGGCAAA GTATCAAGAA GATCCCCAAG TCAAGTCACA TTTGTAAAGC 900 

TGCTTCCCAA TTGGCTTTGT CACGCAGTGT TGAAGCAGTG GGAGAGAGAT TCACCTGTTA 960 

TAAAGGAACT GACTAACACA AGTATCCCGT CTATATCTGA ATGCTGTCTC TAGGTGTAAG 1020 

CCGTGGTTTC GCCTTCGTGG AGTTTTATCA CTTGCAAGAT GCTACCAGCT GGATGGAAGC 1080 

CAATCAGGTT GCTTCACTCA CCAAGTCTAG ATATTCATGA AAATGGAACA AGTCTGTACA 1140 

ATTTTAAAAA AAGGTTGAAG GAGTGGTTTG TTCCAAAGGA GTGACTTTTT TTTAAAAAAA 1200 

AAGCTTTGTA TATATTAAAA TTGATGTTAC TAGAATAAGT ACAGTACCAA GGACTTCATT 1260 

ATAGAATTTG TTCTGCCTTT AAACATGGCT ACCTACCTGG CAGGGCTTTG TTAACTACTG 1320 

AATACCTGTC TGGTAATCAC TAAAACATCT TTATGTTTCC CTTTTTTCTA GTTTGTTATA 1380 

TTCCTATTAT GTCCATTGAG AGTAAGCTTA GTATATCAAA CTCTCCATTT GACAGTGAAG 1440 

AGAACATAGT GAAAGTCTGT GGCGGCATTT TTATAAGTAA TTCCTTATTT CTGCCTGAAG 1500 

ACCACAAAGC CTCCTGGAGG CGTAACTGCT CAGACCGGTC TTCAGGGAAT ATTTAAGGAC 1560 

TTAGTGGAAT TTATGAACAA TAAGTCTGAT GAGATTAGCC TGGGAGTGGT GTCCTGCAGC 1620 

TGTCTAATCT AGAGTGGCAT TAACATTCTA ATCTCCTTGA GAATGCCTTT TATAGTCTGT 1680 

TCAAAGCAAG TCATTGATGG TTCTTCGAGG TAGTGTTAAC TGAAGTGTTC TTCAGTTTGT 1740 

CAAGATAATG TTCAGTGCTT GGCACTTAAA TAACATTTTT TGCAAGAACT CCAAGGCACA 1800 
TTATTGAATG CCTTTAACCA AGTGCATTCT GGGAAGTTTG CTTGACTCAT TATCTTGCTT 1860

TTCTGCAGCA TTCTGTGATT TGAGTCATCC ATGAATCCAT GAATAAAAGT TACATTCTTT 1920 

CATTGGTAAT ATTGCCATTT ATAACAAGAC TCACTAATGA GGGTATCACT TTGACTGACT 1980 

GATTTGTTAA AGTTTTTAAG CCTCTCATTT TCCTAACCCA GAAATCACAG CCTGATTTTA 2040 

TTAAAAGTAG AGCTTCATTC ATTTCATACC ATAGATACCA TCCTAGTAAA TCCAGAACAT 2100 

ATACAAGGTT CATGTGAGTC TGCTTTCTTG ACATGATAGC ATTGTTTGAT GCAGTGGATA 2160

TGTCAGAATG ACTAACCTAG GAGTTTGAAA CTCCTAAGAA ACTAAAACCT GTAAGACATT 2220 

TAAAAGTCTC CACAATTTTA ATGTATACAA AGCTATGTTA CTGTGTAACA CATTACAGTT 2280

CAAATTCACT CCAGAAATAA AAGGCCAGTA GGATTAGGGA CTCACTGGTA GTTTGGAGTC 2340 

TCCCAGCACA CATCCCTCCT AGTGGGATGA TCTATTCACA TATCTCCCAG CTTTTTTATT 2400 

TTTGCTTCTG TATATCACAG TGAGTGGATG GCCCTTCAGC TTTTTCTCTC CTGGCCAGAC 2460 

ATGCAGTCTT GCCTTTAGAT ATCGCAGAGA CAAAATTCAC AGCATGTCTT AAATCTTCCA 2520 

GGATTTGCAA GAACCAAATT GCTCAACAGT ATGTATGTTT AGAGGGGTTA GACTCCTTTT 2580 

TAAAATCTGG ATATCTAACC ACCTACTTAA ATCTGTTTGA TAGTGTCAAA CCACCCCCAC 2640
CCTTGATCCT CCCACCCCCA AAAAAAAAAA AAAA 



SEQ ID NO:280 PBY8 Protein sequence
Protein Accession #: XP_003261

MGSDKRVSRT ERSGRYGSII DRDDRDERES RSRRRDSDYK RSSDDRRGDR YDDYRDYDSP 60
ERERERRNSD RSEDGYHSDG DYGEHDYRHD ISDERESKTI MLRGLPITIT ESDIREMMES 120
FEGPQPADVR LMKRKTGESL LSS 



SEQ ID NO:281 PC12 DNA SEQUENCE

Nucleic Acid Accession #: AF208291

Coding sequence: 109-3705 (underlined sequence corresponds to start and stop codon)

1 11 21 31 41 51

I I I I I I

CGGCCGCTTT TTTCTCAAGA TGGCAGATTC CCACTGAGGC TGAGGGGGCC GAGCTCGCGC 60

GCCGCGTTCC CTTCTCCGTT GCCATGAACC GCGGACACCC CGGCCCCGAT GGCCCCCGTO 120

TACGAAGGTA TGGCCTCACA TGTGCAAGTT TTCTCCCCTC ACACCCTTCA ATCAAGTGCC 180 

TTCTGTAGTG TGAAGAAACT AAAAGTAGAG CCAAGTTCCA ACTGGGACAT GACTGGGTAC 240 

GGCTCCCACA GCAAAGTGTA CAGCCAGAGC AAGAACATAC CACCTTCTCA GCCAGCCTCC 300 

ACAACCGTCA GCACCTCCTT GCCGOTCCCA AACCCAAGCC TACCTTACGA GCAGACCATC 360 

GTCTTCCCAG GAAGCACCGG GCACATCGTG GTCACCTCAG CAAGCAGCAC TTCTGTCACC 420 

GGGCAAGTCC TCGGCGGACC ACACAACCTA ATGCGTCGAA GCACTGTGAG CCTCCTTGAT 480 

ACCTACCAAA AATGTGGACT CAAGCGTAAG AGCGAGGAGA TCGAGAACAC AAGCAGCGTG 540 

CAGATCATCG AGGAGCATCC ACCCATGATT CAGAATAATG CAAGCGGGGC CACTGTCGCC 600 

ACTGCCAOCA CGTCTACTGC CACCTCCAAA AACAGCGGCT OCAACAGCGA GGGCGACTAT 660 

CAGCTGGTGC AGCATGAGGT GCTGTGCTCC ATGACCAACA CCTACGAGGT CTTAGAGTTC 720 

TTGGGCCGAG GGACGTTTGG ACAAGTGGTC AAGTGCTGGA AACGGGGCAC CAATGAGATC 780 

GTAGCCATCA AGATCCTGAA GAACCGCCCA TCCTATGCCC GACAAGGTCA GATTGAAGTG 840 

AGCATCCTGG CCCGGTTGAG CACGGAGAGT GCCGATGACT ATAACTTCGT CCGGGCCTAC 900 

GAATGCTTCC AGCACAAGAA CCACACGTGC TTGGTCTTCG AGATGTTGGA GCAGAACCTC 960 

TATGACTTTC TGAAGCAAAA CAAGTTTAGC OCCTTGCCCC TCAAATACAT TCGCCCAGTT 1020 

CTCCAGCAGG TAGCCACAGC CCTGATGAAA CTCAAAAGCC TAGGTCTTAT CCACGCTGAC 1080 

CTCAAAOCAG AAAACATCAT GCTGGTGGAT CCATCTAGAC AACCATACAG AGTCAAGGTC 1140 

ATCGACTTTG GTTCAGCCAG OCACGTCTCC AAGGCTGTGT GCTCCACCTA CTTGCAGTCC 1200 

AGATATTACA GGGCCCCTGA GATCATCCTT GGTTTACCAT TTTGTGAGGC AATTGACATO 1260 

TGGTCCCTGG GCTGTGTTAT TGCAGAATTG TTCCTGGGTT GGCCGTTATA TCCAGGAGCT 1320 

TCGGAGTATG ATCAGATTCG GTATATTTCA CAAACACAGG GTTTGCCTGC TGAATATTTA 1380 

TTAAGCGCCG GGACAAAGAC AACTAGGTTT TTCAACCGTG ACACGGACTC ACCATATCCT 1440 

TTGTGGAGAC TGAAGACACC AGATGACCAT GAAGCAGAGA CAGGGATTAA GTCAAAAGAA 1500 

GCAAGAAAGT ACATTTTCAA CTGTTTAGAT GATATGGCCC AGGTGAACAT GACGACAGAT 1560 

TTGGAAGGGA GOGACATGTT GGTAGAAAAG GCTGAOCGGC GGGAGTTCAT TGACCTGTTG 1620 

AAGAAGATGC TGACCATTGA TGCTGACAAG AGAATCACTC CAATCGAAAC CCTGAACCAT 1680 

CCCTTTGTCA CCATGACACA CTTACTGGAT TTTCCCCACA GCACACACGT CAAATCATGT 1740 

TTCCAGAACA TGGAGATCTG CAAGCGTCGG GTGAATATGT ATGACACGGT GAACCAGAGC 1800 

AAAACCCCTT TCATCACGCA CGTGGCCCCC AGCACGTCCA CCAACCTGAC CATGACCTTT 1860 

AACAACCAGC TGACCACTGT CCACAACCAG GCTCCCTCCT CTACCAGTGC CACTATTTCC 1920 

TTAGCCAATC CCGAAGTCTC CATACTAAAC TACCCATCTA CACTCTACCA GCCCTCAGCG 1980 

GCATCCATGG CTGCAGTGGC CCAGCGGAGC ATGCCCCTGC AGACAGGAAC AGCCCAGATT 2040 

TGTGCCCGGC CTGACCCGTT CCAGCAAGCT CTCATCGTGT GTCCCCCCGG CTTCCAAGGC 2100 

TTGCAGGCCT CTCCCTCTAA GCACGCTGGC TACTCGGTGC GAATGGAAAA TGCAGTTCCC 2160 

ATCGTCACTC AAGCCOCAGG AGCTCAGCCT CTTCAGATCC AACCAGGTCT GCTTGCCCAG 2220 

CAGGCTTGGC CAAGTGGGAC CCAGCAGATC CTGCTTCCCC CAGCATGGCA GCAACTGACT 2280 

GGAGTGGCCA CCCACACATC AGTGCAGCAT GCCACCGTGA TTCCCGAGAC CATGGCAGGC 2340 

ACCCAGCAGC TGGCGGACTG GAGAAATACG CATGCTCACG GAAGCCATTA TAATCCCATC 2400 

ATGCAGCAGC CTGCACTATT GACCGGTCAT GTGACCCTTC CAGCAGCACA GCCCTTAAAT 2460 

GTGGGTGTGG CCCACGTGAT GCGGCAGCAG CCAACCAGCA CCACCTCCTC CCGGAAGAGT 2520 

AAGCAGCACC AGTCATCTGT GAGAAATGTC TCCACCTGTG AGGTGTCCTC CTCTCAGGCC 2580 

ATCAGCTCCC CACAGCGATC CAAGCGTGTC AAGGAGAACA CACCTCCCCG CTGTGCCATG 2640 

GTGCACAGTA GCCCGGCCTG CAGCACCTCG GTCACCTGTG GGTGGGGCGA CGTGGCCTCC 2700 

AGCACCACCC GGGAACGGCA GCGGCAGACA ATTGTCATTC CCGACACTCC CAGCCCCACG 2760 

GTCAGCGTCA TCACCATCAG CAGTGACACG GACGAGGAGG AGGAACAGAA ACACGCCCCC 2820 

ACCAGCACTG TCTCCAAGCA AAGAAAAAAC GTCATCAGCT GTGTCACAGT CCACGACTCC 2880 

CCCTACTCCG ACTCCTCCAG CAACACCAGC CCCTACTCCG TGCAGCAGCG TGCTGGGCAC 2940 
AACAATGCCA ATGCCTTTGA CACCAAGGGG AGCCTGGASA ATCACTGCAC GGGGAACCCC 3000

CGAACCATCA TCGTGCCACC CCTGAAAACC CAGGCCAGCG AAGTATTGGT GGAGTGTGAT 3060 

AGCCTGGTGC CAGTCAACAC CAGTCACCAC TCGTCCTCCT ACAAGTCCAA GTCCTCCAGC 3120 

AACGTCACCT CCACCAGCGG TCACTCTTCA GGGAGCTCAT CTGGAGCCAT CACCTACCGG 3180 

CAGCAGCGGC CGGGCCCCCA CTTCCAGCAG CAGCAGCCAC TCAATCTCAG CCAGGCTCAG 3240 

CAGCACATCA CCACGGACCG CACTGGGAGC CACCGAAGGC AGCAGGCCTA CATCACTCCC 3300 

ACCATGGCCC AGGCTCCGTA CTCCTTCCCG CACAACAGCC CCAGCCACGG CACTGTGCAC 3360 

CCGCATCTGG CTGCAGCCGC TGCCGCTGCC CACCTCCCCA CCCAGCCCCA CCTCTACACC 3420 

TACACTGCGC CGGCGGCCCT GGGCTCCACC GGCACCGTGG CCCACCTGGT GGCCTCGCAA 3480 

GGCTCTGCGC GCCACACCGT GCAGCACACT GCCTACCCAG OCAGCATCGT CCACCAGGTC 3540 

CCCGTGAGCA TGGGCCCCCG GGTCCTGCCC TCGCCCACCA TCCACCCGAG TCAGTATCCA 3600 

GCCCAATTTG CCCACCAGAC CTACATCAGC GCCTCGCCAG CCTCCACCGT CTACACTGGA 3660 

TACCCACTGA GCCOCGCCAA GGTCAACCAG TACCCTTACA TATAAACACT GGAGGGGAGG 3720 

GAGGGAGGGA GGGAGGGAGA GAATGGCCCG AGGGAGGAGG GAGAGAAGGA GGGAGGCGCT 3780 

CCTGGGACCG TGGGCGCTGG CCTTTTATAC TGAAGATGCC GCACACAAAC AATGCAAACG 3840 

GGGCAGGGGC GGGGGGGGGG GGGGCAGAGG GCAGGGGGAC GGGTCGGGAC ACCAGTGAAA 3900 

CTTGAACCGG GAAGTGGGAG GACGTAGAGC AGAGAAGAGA ACATTTTTAA AAGGAAGGGA 3960 
TTAAAGAGGG TGGGAAATCT ATGGTTTTTA TTTTAAAAAA 



SEQ ID NO:282 PC12 Protein sequence
Protein Accession #: NP_073577

MAPVYEGMAS HVQVFSPHTL QSSAFCSVKK LKVEPSSNWD MTGYGSHSKV YSQSKNIPPS 60
QPASTTVSTS LPVPNPSLPY EQTIVFPGST GHIVVTSASS TSVTGQVLGG PHNLMRRSTV 120
SLLDTYQKCG LKRKSEEIEN TSSVQIIEEH PPMIQNNASG ATVATATTST ATSKNSGSNS 180
EGDYQLVQHE VLCSMTNTYE VLEFLGRGTF GQVVKCWKRG TNEIVAIKIL KNRPSYARQG 240
QIEVSILARL STESADDYNF VRAYECFQHK NHTCLVFEML EQNLYDFLKQ NKFSPLPLKY 300
IRPVLQQVAT ALMKLKSLGL IHADLKPENI MLVDPSRQPY RVKVIDFGSA SHVSKAVCST 360
YLQSRYYRAP EIILGLPFCE AIDMWSLGCV IAELFLGWPL YPGASEYDQI RYISQTQGLP 420
AEYLLSAGTK TTRFFNRDTD SPYPLWRLKT PDDHEAETGI KSKEARKYIF NCLDDMAQVN 480
MTTDLEGSDM LVEKADRREF IDLLKKMLTI DADKRITPIE TLNHPFVTMT HLLDFPHSTH 540
VKSCFQNMEI CKRRVNMYDT VNQSKTPFIT HVAPSTSTNL TMTFNNQLTT VHNQAPSSTS 600
ATISLANPEV SILNYPSTLY QPSAASMAAV AQRSMPLQTG TAQICARPDP FQQALIVCPP 660
GFQGLQASPS KHAGYSVRME NAVPIVTQAP GAQPLQIQPG LLAQQAWPSG TQQILLPPAW 720
QQLTGVATHT SVQHATVIPE TMAGTQQLAD WRNTHAHGSH YNPIMQQPAL LTGHVTLPAA 780
QPLNVGVAHV MRQQPTSTTS SRKSKQHQSS VRNVSTCEVS SSQAISSPQR SKRVKENTPP 840
RCAMVHSSPA CSTSVTCGWG DVASSTTRER QRQTIVIPDT PSPTVSVITI SSDTDEEEEQ 900
KHAPTSTVSK QRKNVISCVT VHDSPYSDSS SNTSPYSVQQ RAGHNNANAF DTKGSLENHC 960
TGNPRTIIVP PLKTQASEVL VECDSLVPVN TSHHSSSYKS KSSSNVTSTS GHSSGSSSGA 1020
ITYRQQRPGP HFQQQQPLNL SQAQQHITTD RTGSHRRQQA YITPTMAQAP YSFPHNSPSH 1080
GTVHPHLAAA AAAAHLPTQP HLYTYTAPAA LGSTGTVAHL VASQGSARHT VQHTAYPASI 1140 
VHQVPVSMGP RVLPSPTIHP SQYPAQFAHQ TYISASPAST VYTGYPLSPA KVNQYPYI 

SEQ ID NO:283 PBY1 DNA SEQUENCE

Nucleic Acid Accession #: NM_017700

Coding sequence: 147-806 (underlined sequence corresponds to start and stop codon)

1 11 21 31 41 51

I I I I I I 

AGTCACAGCC AGGTAACCCT GGAGTGAAGC GGTTTAGTTA GAAGGGAGCA GATAAACTCG 60 

TCACTCTAGT AGCTTTAACC CTCACCCTGA GGCACCTTAG CAATCAGCCA TTGCCTGCAA 120 

GCCTCGAAAG CTTGTCTTTG CCTAATATGG AGCCCAAAGA AGCCACTGGG AAAGAAAACA 180 

TGGTCACCAA GAAAAAGAAT CTGGCCTTCT TGAGGTCTAG ACTCTATATG CTGGAGAGAA 240 

GGAAGACTGA CACTGTGGTT GAGAGCAGTG TTTCTGGGGA CCACTCTGGC ACCTTGAGGA 300 

GGAGCCAATC TGACAGGACC GAATACAACC AGAAATTACA AGAAAAGATG ACTCCACAGG 360 

GTGAGTGTTC TGTAGCTGAG ACCTTAACCC CAGAGGAAGA GCATCATATG AAGAGGATGA 420 

TGGCAAAGCG GGAAAAGATC ATTAAGGAGC TGATACAGAC AGAAAAGGAT TATCTCAATG 480

ATCTAGAGCT GTGTGTTAGG GAAGTGGTTC AGCCCCTGAG AAATAAAAAG ACTGATAGGC 540 

TGGATGTGGA TAGCTTGTTT AGCAACATTG AGTCCGTGCA TCAGATATCA GCCAAGCTGC 600 

TGTCATTGTT GGAAGAGGCC ACAACAGACG TGGAACCGGC CATGCAAGTA ATTGGAGAAG 660

TATTCTTGCA GATTAAAGGG CCACTGGAAG ATATTTATAA AATCTACTGC TATCACCATG 720 

ATGAAGCACA TAGTATACTG GAGTCCTATG AAAAGGAAGA AGAGCTGAAG GAACATTTGA 780 

GCCACTGTAT CCAGTCCTTA AAGTAAGGCC TTTTCAAATG ATGATTCCCA TCTCCTCTCA 840 

GTTGCCTAGC AGGGAACATT TTAAATGGAT GTAGATGAAA GGTCTCACAT AAATCCTATG 900 

TTTTATGAGA CTTGCTGGGA GCTCTGCTTT GCATTCCCTT TATAAAAAGC TGACATGCCA 960 

GAAGCCCTGA TTGACTTTTT TTCCCCCTGC GAGAATGACT AAAAATAACA TGGAAGAAGA 1020 

TTTAGAGCTC TGCAGCGATT GAAAAATGCA ATATCAAAAT ATAAAATGTG GAAGAAAAGC 1080 

CTCTTCTTAA AGCTATTGTA ACTTGCCTGG CCCCACGTAG TTCAAGGATT ATGTGAGATA 1140 

ACACGTGGCC CCATGACCAC TGGAGCACAT GGGTTAATGG AGTTAGGGGA ATGGCCTACA 1200 

ACTCTGCATG GCCGTCTTCT TTCCCCAAAC TCACTGTGGG GAGATGGGTG AAGACAAGTC 1260 

AGGCCTTGTT AAAGTTAGTT TCAGAACAAT TACTCATGCC TTCCTTTCTC ATCCCTAAAA 1320 

CATTGGTGGG GGAGCTACAC AATGTACTTT TTCTTTTCTA GAGGAAOTAT CTATTCACTG 1380 

TGAAAATCTG AAAAATATAA CAAAGTATGT GTAAGATAAA AACCCCTTGC TATTTCAAAA 1440 
AAAAAAAAAA AAAAAAAAAA AAAA 

SEQ ID NO:284 PBY1 Protein sequence
Protein Accession #: NP_060170

1 11 21 31 41 51

MEPKEATGKE NMVTKKKNLA FLRSRLYMLE RRKTDTVVES SVSGDHSGTL RRSQSDRTEY 60

NQKLQEKMTP QGECSVAETL TPEEEHHMKR MMAKREKIIK ELIQTEKDYL NDLELCVREV 120

VQPLRNKKTD RLDVDSLFSN IESVHQISAK LLSLLEEATT DVEPAMQVIG EVFLQIKGPL 180
EDIYKIYCYH HDEAHSILES YEKEEELKEH LSHCIQSLK



SEQ ID NO:285 PBQ9 DNA SEQUENCE 

Nucleic Acid Accession #: X66534

Coding sequence: 523-2876 (underlined sequence corresponds to start and stop codon)



1 11 21 31 41 51

I I I I I I 

CCCTTATGGC GATTGGGCGG CTGCAGAGAC CAGGACTCAG TTCCCCTGCC CTAGTCTGAG 60

CCTAGTGGGT GGGACTCAGC TCAGAGTCAG TTTTCAGAAG CAGGTTTCAG TTGCAGAGTT 120 

TTCCTACACT TTTCCTGCGC TAGAGCAGCG AGCAGCCTGG AACAGACCCA GGCGGAGGAC 180 

ACCTGTGGGG GAGGGAGCGC CTGGAGGAGC TTAGAGACCC CAGCCGGGCG TGATCTCACC 240 

ATGTGCGGAT TTGCGAGGCG CGCCCTGGAG CTGCTAGAGA TCCGGAAGCA CAGCCCCGAG 300 

GTGTGCGAAG CCACCAAGAC TGCGGCTCTT GGAGAAAGCG TGAGCAGGGG GCCACCGCGG 360 

TCTCCGGCCT GTCTGCACCC TGTCGCCTGA GCTGCCTGAC AGTGACAATG ACATCCCAGT 420 

TACCAGTGTC CTTGAATTGA TAGTGGCTTC TGTTTGTCAG TCTCATATAA GAACTACAGC 480 

TCATCAGGAG GAGATCGCAG CAGGGTAAGA GACACCAACA CCATGTTCTG CACGAAGCTC 540 

AAGGATCTCA AGATCACAGG AGAGTGTCCT TTCTCCTTAC TGGCACCAGG TCAAGTTCCT 600 

AACGAGTCTT CAGAGGAGGC AGCAGGAAGC TCAGAGAGCT GCAAAGCAAC CGTGCCCATC 660

TGTCAAGACA TTCCTGAGAA GAACATACAA GAAAGTCTTC CTCAAAGAAA AACCAGTCGG 720 

AGCCGAGTCT ATCTTCACAC TTTGGCAGAG AGTATTTGCA AACTGATTTT CCCAGAGTTT 780

GAACGGCTGA ATGTTGCACT TCAGAGAACA TTGGCAAAGC ACAAAATAAA AGAAAGCAGG 840 

AAATCTTTGG AAAGAGAAGA CTTTGAAAAA ACAATTGCAG AGCAAGCAGT GCAGCAGAGT 900 

CCAGTGGAGT TATCAAAGAA TCTCTTGGTG AAGAGGTTTT TAAAATATGT TACGAGGAAG 960

ATGAAAACAT CCTTGGGGTG GTTGGAGGCA CCCTTAAAGA TTTTTAAACA GCTTCAGTAC 1020 

CCTTCTGAAA CAGAGCAGCC ATTGCCAAGA AGCAGGAAAA AGGGGCAGCT TGAGGACGCC 1080 

TCCATTCTAT GCCTGGATAA GGAGGATGAT TTTCTACATG TTTACTACTT CTTCCCTAAG 1140 

AGAACCACCT CCCTGATTCT TCCCGGCATC ATAAAGGCAG CTGCTCACGT ATTATATGAA 1200 

ACGGAAGTGG AAGTGTCGTT AATGCCTCCC TGCTTCCATA ATGATTGCAG CGAGTTTGTG 1260 

AATCAGCCCT ACTTGTTGTA CTCCGTTCAC ATGAAAAGCA CCAAGCCATC CCTGTCCCCC 1320 

AGCAAACCCC AGTCCTCGCT GGTGATTCCC ACATCGCTAT TCTGCAAGAC ATTTCCATTC 1380 

CATTTCATGT TTGACAAAGA TATGACAATT CTGCAATTTG GCAATGGCAT CAGAAGGCTG 1440 

ATGAACAGGA GAGACTTTCA AGGAAAGCCT AATTTTGAAT ACTTTGAAAT TCTGACTCCA 1500 

AAAATCAACC AGACCTTTAG CGGGATCATG ACTATGTTGA ATATGCAGTT TGTTGTACGA 1560 

GTGAGGAGAT GGGACAACTC TGTGAAGAAA TCTTCAAGGG TTATGGACCT CAAAGGCCAA 1620 

ATGATCTACA TTGTTGAATC CAGTGCAATC TTGTTTTTGG GGTCACCCTG TGTGGACAGA 1680 

TTAGAAGATT TTACAGGACG AGGGCTCTAC CTCTCAGACA TCCCAATTCA CAATGCACTG 1740

AGGGATGTGG TCTTAATAGG GGAACAAGCC CGAGCTCAAG ATGGCCTGAA GAAGAGGCTG 1800

GGGAAGCTGA AGGCTACCCT TGAGCAAGCC CACCAAGCCC TGGAGGAGGA GAAGAAAAAG 1860 

ACAGTAGACC TTCTGTGCTC CATATTTCCC TGTGAGGTTG CTCAGCAGCT GTGGCAAGGG 1920 

CAAGTTGTGC AAGCCAAGAA GTTCAGTAAT GTCACCATGC TCTTCTCAGA CATCGTTGGG 1980 

TTCACTGCCA TCTGCTCCCA GTGCTCACCG CTGCAGGTCA TCACCATGCT CAATGCACTG 2040 

TAGACTCGCT TCGACCAGCA GTGTGGAGAG CTGGATGTCT ACAAGGTGGA GACCATTGCG 2100 

ATGCCTATTG TGTGGCTTGG GGGATTACAC AAAGAGAGTG ATACTCATGC TGTTCAGATA 2160 

GCGCTGATGG CCCTGAAGAT GATGGAGCTC TCTGATGAAG TTATGTCTCC CCATGGAGAA 2220 

CCTATCAAGA TGCGAATTGG ACTGCACTCT GGATCAGTTT TTGCTGGCGT CGTTGGAGTT 2280 

AAAATGCCCC GTTACTGTCT TTTTGGAAAC AATGTCACTC TGGCTAACAA ATTTGAGTCC 2340 

TGCAGTGTAC CACGAAAAAT CAATGTCAGC CCAACAACTT ACAGATTACT CAAAGACTGT 2400 

CCTGGTTTCG TGTTTACCCC TCGATCAAGG GAGGAACTTC CACCAAACTT CCCTAGTGAA 2460 

ATCCCCGGAA TCTGCCATTT TCTGGATGCT TACCAACAAG GAACAAACTC AAAACCATGC 2520 

TTCCAAAAGA AAGATGTGGA AGATGCAAGC CAATTTTTTA GGCAAAGCAT CAGGAATAGA 2580 

TTAGCAACCT ATATACCTAT TTATAAGTCT TTGGGGTTTG ACTCATTGAA GATGTGTAGA 2640 

GCCTCTGAAA GCACTTTAGG GATTGTAGAT GGCTAACAAG CAGTATTAAA ATTTCAGGAG 2700 

CCAAGTCACA ATCTTTCTCC TGTTTAACAT GACAAAATGT ACTCACTTCA GTACTTCAGC 2760 

TCTTCAAGAA AAAAAAAAAA ACCTTAAAAA GCTACTTTTG TGGGAGTATT TCTATTATAT 2820 

AACCAGCACT TACTACCTGT ACTCAAAATT CAGCACCTTG TACATATATC AGATAATTGT 2880

AGTCAATTGT ACAAACTGAT GGAGTCACCT GCAATCTCAT ATCCTGGTGG AATGCCATGG 2940 

TTATTAAAGT GTGTTTGTGA TAGTTGTCGT CAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 3000 
AAAA 

SEQ ID NO:286 PBQ9 Protein sequence
Protein Accession #:

1 11 21 31 41 51 

I I I I I I 

MFCTKLKDLK ITGECPFSLL APGQVPNESS EEAAGSSESC KATVPICQDI PEKNIQESLP 60

QRKTSRSRVY LHTLAESICK LIFPEFERLN VALQRTLAKH KIKESRKSLE REDFEKTIAE 120

CAVAAGVPVE VIKESLGEEV FKICYEEDEN ILGVVGGTLK DFLNSFSTLL KQSSHCQEAG 180

KRGRLEDASI LCLDKEDDFL HVYYFFPKRT TSLILPGIIK AAAHVLYETE VEVSLMPPCF 240

HNDCSEFVNQ PYLLYSVHMK STKPSLSPSK PQSSLVIPTS LFCKTFPFHF MFDKDMTILQ 300

FGNGIRRLMN RRDFQGKPNF EEYFEILTPK INQTFSGIMT MLNMQFVVRV RRWDNSVKKS 360

SRVMDLKGQM IYIVESSAIL FLGSPCVDRL EDFTGRGLYL SDIPIHNALR DVVLIGEQAR 420

AQDGLKKRLG KLKATLEQAH QALEEEKKKT VDLLCSIFPC EVAQQLWQGQ VVQAKKFSNV 480

TMLFSDIVGF TAICSQCSPL QVITMLNALY TRFDQQCGEL DVYKVETIGD AYCVAGGLHK 540
ESDTHAVQIA LMALKMMELS DEVMSPHGEP IKMRIGLHSG SVFAGVVGVK MPRYCLFGNN 600
VTLANKFESC SVPRKINVSP TTYRLLKDCP GFVFTPRSRE ELPPNFPSEI PGICHFLDAY 660
QQGTNSKPCF QKKDVEDGNA NFLGRASGID



SEQ ID NO:287 PFD2 DNA SEQUENCE

Nucleic Acid Accession #: NM_000720

Coding sequence: 1 1 94664 (underlined sequence corresponds to start and stop codon) 



1 11 21 31 41 51 

I I I I I I 

AGAATAAGGG CAGGGACCGC GGCTCCTATC TCTTGGTGAT CCCCTTCCCC ATTCCGCCCC 60 

CGCCTCAACG OCCAGCACAG TGCCCTGCAC ACAGTAGTCG CTCAATAAAT GTTCGTGGAT 120 

GATGATGATG ATGATGATGA AAAAAATGCA GCATCAACGG CAGCAGCAAG CGGACCACGC 180 

GAACGAGGCA AACTATGCAA GAGGCACCAG ACTTCCTCTT TCTGGTGAAG GACCAACTTC 240 

TCAGCCGAAT AGCTCCAAGC AAACTGTCCT GTCTTGGCAA GCTGCAATCG ATGCTGCTAG 300 

ACAGGCCAAG GCTGCCCAAA CTATGAGCAC CTCTGCACCC CCACCTGTAG GATCTCTCTC 360 

CCAAAGAAAA CGTCAGCAAT ACGCCAAGAG CAAAAAACAG GGTAACTCGT CCAACAGCCG 420 

ACCTGCCCGC GCOCTTTTCT GTTTATCACT CAATAACCCC ATCCGAAGAG CCTGCATTAG 480 

TATAGTGGAA TGGAAACCAT TTGACATATT TATATTATTG GCTATTTTTG CCAATTGTGT 540 

GGCCTTAGCT ATTTACATCC CATTCCCTGA AGATGATTCT AATTCAACAA ATCATAACTT 600 

GGAAAAAGTA GAATATGCCT TCCTGATTAT TTTTACAGTC GAGACATTTT TGAAGATTAT 660 

AGCGTATGGA TTATTGCTAC ATCCTAATGC TTATGTTAGG AATGGATGGA ATTTACTGGA 720 

TTTTGTTATA GTAATAGTAG GATTGTTTAG TGTAATTTTG GAACAATTAA OCAAAGAAAC 780 

AGAAGGCGGG AACCACTCAA GCGGCAAATC TGGAGGCTTT GATGTCAAAG CCCTCCGTGC 840 

CTTTCGAGTG TTGCGACCAC TTCGACTAGT GTCAGGGGTG CCCAGTTTAC AAGTTGTCCT 900 

GAACTCCATT ATAAAAGCCA TGGTTCCCCT CCTTCACATA GCCCTTTTGG TATTATTTGT 960

AATCATAATC TATGCTATTA TAGGATTGGA ACTTTTTATT GGAAAAATGC ACAAAACATG 1020 

TTTTTTTGCT GACTCAGATA TCGTAGCTGA AGAGGACCCA GCTCCATGTG CGTTCTCAGG 1080 

GAATGGACGC CAGTGTACTG CCAATGGCAC GGAATGTAGG AGTGGCTGGG TTGGCCCGAA 1140 

CGGAGGCATC ACCAACTTTG ATAACTTTGC CTTTGCCATG CTTACTGTGT TTCAGTGCAT 1200 

CACCATGGAG GGCTGGACAG ACGTGCTCTA CTGGGTAAAT GATGCGATAG GATCGGAATG 1260 

GCCATGGGTG TATTTTGTTA GTCTGATCAT CCTTGGCTCA TTTTTCGTCC TTAACCTGGT 1320 

TCTTGGTGTC CTTAGTGGAG AATTCTCAAA GGAAAGAGAG AAGGCAAAAG CACGGGGAGA 1380 

TTTCCAGAAG CTCCGGGAGA AGCAGCAGCT GGAGGAGGAT CTAAAGGGCT ACTTGGATTG 1440 

GATCACCCAA GCTGAGGACA TCGATCCGGA GAATGAGGAA GAAGGAGGAG AGGAAGGCAA 1500 

ACGAAATACT AGCATGCCCA CCAGCGAGAC TGAGTCTGTG AACACAGAGA ACGTCAGCGG 1560 

TGAAGGCGAG AACCGAGGCT GCTGTGGAAG TCTCTGGTGC TGGTGGAGAC GGAGAGGCGC 1620 

GGCCAAGGCG GGGOCCTCTG GGTGTCGGCG GTGGGGTCAA GCCATCTCAA AATCCAAACT 1680 

CAGCCGACGC TGGCGTCGCT GGAACCGATT CAATCGCAGA AGATGTAGGG CCGCCGTGAA 1740 

GTCTGTCACG TTTTACTGGC TGGTTATCGT CCTGGTGTTT CTGAACACCT TAACCATTTC 1800 

CTCTGAGCAC TACAATCAGC CAGATTGGTT GACACAGATT CAAGATATTG CCAACAAAGT 1860 

CCTCTTGGCT CTGTTCACCT GCGAGATGCT GGTAAAAATG TACAGCTTGG GCCTCCAAGC 1920 

ATATTTCGTC TCTCTTTTCA ACCGGTTTGA TTGCTTCGTG GTGTGTGGTG GAATCACTGA 1980 

GACGATCCTG GTGGAACTGG AAATCATGTC TCCCCTGGGG ATCTCTGTGT TTCGGTGTGT 2040 

GCGCCTCTTA AGAATCTTCA AAGTGACCAG GCACTGGACT TCCCTGAGCA ACTTAGTGGC 2100 

ATCCTTATTA AACTCCATGA AGTCCATCCC TTCGCTGTTG CTTCTGCTTT TTCTCTTCAT 2160 

TATCATCTTT TCCTTGCTTG GGATGCAGCT GTTTGGCGGC AAGTTTAATT TTGATGAAAC 2220 

GCAAACCAAG CGGAGCACCT TTGACAATTT CCCTCAAGCA CTTCTCACAG TGTTCCAGAT 2280 

CCTGACAGGC GAAGACTGGA ATGCTGTGAT GTACGATGGC ATCATGGCTT ACGGGGGCCC 2340 

ATCCTCTTCA GGAATGATCG TCTGCATCTA CTTCATCATC CTCTTCATTT GTGGTAACTA 2400 

TATTCTACTG AATGTCTTCT TGGCCATCGC TGTAGACAAT TTGGCTGATG CTGAAAGTCT 2460 

GAACACTGCT CAGAAAGAAG AAGOGGAAGA AAAGGAGAGG AAAAAGATTG CCAGAAAAGA 2520 

GAGCCTAGAA AATAAAAAGA ACAACAAACC AGAAGTCAAC CAGATAGCCA ACAGTGACAA 2580 

CAAGGTTACA ATTGATGACT ATAGAGAAGA GGATGAAGAC AAGGACCCCT ATCCGCCTTG 2640 

CGATGTGCCA GTAGGGGAAG AGGAAGAGGA AGAGGAGGAG GATGAACCTG AGGTTCCTGC 2700 

CGGACCCCGT CCTCGAAGGA TCTCGGAGTT GAACATGAAG GAAAAAATTG GCCCCATCCC 2760 

TGAAGGGAGC GCTTTCTTCA TTCTTAGCAA GACCAACCCG ATCCGCGTAG GCTGCCACAA 2820 

GCTCATCAAC CACCACATCT TCACCAACCT CATCCTTGTC TTCATCATGC TGAGCAGCGC 2880 

TGCCCTGGCC GCAGAGGACC CCATCCGCAG CCACTCCTTC CGGAACACGA TACTGGGTTA 2940 

CTTTGACTAT GCCTTCACAG CCATCTTTAC TGTTGAGATC CTGTTGAAGA TGACAACTTT 3000 

TGGAGCTTTC CTCCACAAAG GGGCCTTCTG CAGGAACTAC TTCAATTTGC TGGATATGCT 3060 

GGTGGTTGGG GTGTCTCTGG TGTCATTTGG GATTCAATCC AGTGCCATCT CCGTTGTGAA 3120 

GATTCTGAGG GTCTTAAGGG TCCTGCGTCC CCTCAGGGCC ATCAACAGAG CAAAAGGACT 3180 

TAAGCACGTG GTCCAGTGCG TCTTCGTGGC CATCCGGACC ATCGGCAACA TCATGATCGT 3240 

CACTACCCTC CTGCAGTTCA TGTTTGCCTG TATCGGGGTC CAGTTGTTCA AGGGGAAGTT 3300 

CTATCGCTGT ACGGATGAAG CCAAAAGTAA CCCTGAAGAA TGCAGGGGAC TTTTCATCCT 3360 

CTACAAGGAT GGGGATGTTG ACAGTCCTGT GGTCCGTGAA CGGATCTGGC AAAACAGTGA 3420 

TTTCAACTTC GACAAOGTCC TCTCTGCTAT GATCGCGCTC TTCACAGTCT CCACGTTTGA 3480 

GGGCTGGCCT GCGTTGCTGT ATAAAGCCAT CGACTCGAAT GGAGAGAACA TCGGCCCAAT 3540 

CTACAACCAC CGCGTGGAGA TCTCCATCTT CTTCATCATC TACATCATCA TTGTAGCTTT 3600 

CTTCATGATG AACATCTTTG TGGGCTTTGT CATCGTTACA TTTCAGGAAC AAGGAGAAAA 3660 

AGAGTATAAG AACTGTGAGC TGGACAAAAA TCAGCGTCAG TGTGTTGAAT ACGCCTTGAA 3720 

AGCACGTCCC TTGCGGAGAT ACATCCCCAA AAACCCCTAC CAGTACAAGT TCTGGTACGT 3780 

GGTGAACTCT TCGCCTTTCG AATACATGAT GTTTGTCCTC ATCATGCTCA ACACACTCTG 3840 

CTTGGCCATG CAGCACTACG AGCAGTCCAA GATGTTCAAT GATGCCATGG ACATTCTGAA 3900 

CATGGTCTTC ACCGGGGTGT TCACCGTCGA GATGGTTTTG AAAGTCATCG CATTTAAGCC 3960 

TAAGGGGTAT TTTAGTGACG CCTGGAACAC GTTTGACTCC CTCATCGTAA TCGGCAGCAT 4020 

TATAGACGTG GCCCTCAGCG AAGCGGACCC AACTGAAAGT GAAAATGTCC CTGTCCCAAC 4080 
TGCTACACCT GGGAACTCTG AAGAGAGCAA TAGAATCTCC ATCACCTTTT TCCOTCTTTT 4140

CCGAGTGATG CGATTGGTGA AGCTTCTCAG CAGGGGGGAA GGCATCCGGA CATTGCTGTG 4200 

GACTTTTATT AAGTCCTTTC AGGCGCTCCC GTATGTGGCC CTCCTCATAG CCATGCTGTT 4260 

CTTCATCTAT GOGGTCATTG GCATGCAGAT GTXTGGGAAA GTTGCCATGA GAGATAACAA 4320 

CCAGATCAAT A6GAACAATA ACTTCCAGAC GTTTCCCCA6 GCGGTGCTGC TGCTCTTCAG 4380 

GTGTG CAACA GGTGAGGCCT GGCAGGAGAT CATGCTGGCC TGTCTCCCAG GGAAGCTCTG 4440 

TGACCCTGA6 TCAGATTACA ACCCCGGGOA GGAQTATACA TGTGGGAGCA ACTTTGCCAT 4500 

TGTCTATTTC ATCAGTTTTT ACATGCTCTG TGCATTTCTG ATCATCAATC TGTTTGTGGC 4560 

TGTCATCATG GATAATTTCG ACTATCTGAC CCGGGACTGG TCTATTTT6G GGCCTCACCA 4620 

TTTAGATGAA TTCAAAAGAA TATGGTCAGA ATATGACCCT GAGGCAAAGG GAAGGATAAA 4680 

ACACCTTGAT GTGGTCACTC TGCTTCGACG CATCCAGCCT CCCCTGGGGT TTGGGAAGTT 4740 

ATGTCCACAC AGGGTAGCGT GCAAGAQATT AGTTGCCATG AACATGCCTC TCAACAGTGA 4800

CGGGACAGTC ATGTTTAATG CAACCCTGTT TGCTTTGGTT CGAACGGCTC TTAAGATCAA 4860 

GACCGAAGGG AACCTGGAGC AAGCTAATGA AGAACTTCGG GCTGTGATAA AGAAAATTTG 4920 

GAAGAAAACC AGCATGAAAT TACTTGACCA AGTTGTCCCT CCAGCTGGTC ATGATGAGGT 4980 

AACCGTGGGG AAGTTCTATG CCACTTTCCT GATACAGGAC TACTTTASGA AATTCAAGAA 5040 

ACGGAAAGAA CAAGGACTGG TGGGAAAGTA CCCTGCGAAG AACACCACAA TTGCCCTACA 5100 

GGCGGGATTA AGGACACTGC ATGACATTGG GCCAGAAATC CGGCGTGCTA TATCGTGTGA 5160 

TTTGCAAGAT GACGAGCCTG AGGAAACAAA ACGAGAAGAA GAAGATGATG TGTTCAAAAG 5220 

AAATGGTGCC CTGCTTGGAA ACCATGTCAA TCATGTTAAT AGTGATAGGA GAGATTOCCT 5280 

TCAGCAGACC AATACCACCC ACCGTCCCCT GCATGTCCAA AGGCCTTCAA TTCCACCTGC 5340 

AAGTGATACT GAGAAACCGC TGTTTCCTCC AGCAGGAAAT TCGGTGTGTC ATAACCATCA 5400 

TAACCATAAT TCCATAGGAA AGCAAGTTCC CACCTCAACA AATGCCAATC TCAATAATGC 5460 

CAATATGTCC AAAGCTGCCC ATGGAAAGCG GCCCAGCATT GGGAACCTTG AGCATGTGTC 5520 

TGAAAATGGG CATCATTCTT CCCACAAGCA TGACCGGGAG CCTCAGAGAA GGTCCAGTGT 5580 

GAAAAGAACC CGCTATTATG AAACTTACAT TAGGTCCGAC TCAGGAGATG AACAGCTCCC 5640 

AACTATTTGC CGGGAAGACC CAGAGATACA TGGCTATTTC AGGGACCCCC ACTGCTTGGG 5700 

GGAGCAGGAG TATTTCAGTA GTGAGGAATG CTACGAGGAT GACAGCTCGC CCACCTGGAG 5760 

CAGGCAAAAC TATGOCTACT ACAGCAGATA CCCAGGCAGA AACATCGACT CTGAGAGGCC 5820 

CCGAGGCTAC CATCATCCCC AAGGATTCTT -GGAGGACGAT GACTCGCCCG TTTGCTATGA 5880 

TTCACGGAGA TCTCCAAGGA GACGCCTACT ACCTCCCACC CCAGCATCCC ACOGGAGATC 5940 

CTCCTTCAAC TTTGAGTGCC TGCGCCGGCA GAGCAGCCAG GAAGAGGTCC CGTCGTCTCC 6000 

CATCTTCCCC CATCGCACGG CCCTGCCTCT GCATCTAATG CAGCAACAGA TCATGGCAGT 6060 

TGCCGGCCTA GATTCAAGTA AAGCCCAGAA GTACTCACCG AGTCACTCGA CCCGGTCGTG 6120 

GGCCACCCCT CCAGCAACCC CTCCCTACCG GGACTGGACA CCGTGCTACA CCCCCCTGAT 6180 

CCAAGTGGAG CAGTCAGAGG CCCTGGACCA GGTGAACGGC AGCCTGCCGT CCCTCCACCG 6240 

CAGCTCCTGG TACACAGACG AGCCCGACAT CTCCTACCGG ACTTTCACAC CAGCCAGCCT 6300 

GACTCTCCCC AGCAGCTTCC GGAACAAAAA CAGCGACAAG CAGAGGAGTG CGGACAGCTT 6360 

GGTGGAGGCA GTCCTGATAT CCGAAGGCTT GGGACGCTAT GCAAGGGACC CAAAATTTGT 6420 

GTCAGCAACA AAACACGAAA TCGCTGATGC CTGTGACCTC ACCATCGACG AGATGGAGAG 6480 

TGCAGCCAGC ACCCTGCTTA ATGGGAACGT GCGTCCCCGA GOCAACGGGG ATGTGGGCCC 6540 

CCTCTCACAC CGGCAGGACT ATGAGCTACA GGACTTTGGT CCTGGCTACA GCGACGAAGA 6600 

GCCAGACCCT GGGAGGGATG AGGAGGACCT GGCGGATGAA ATGATATGCA TCACCACCTT 6660 

GTAGCCCCCA GCGAGGGGCA GACTGGCTCT GGCCTCAGGT GGGGCGCAGG AGAGCCAGGG 6720 

GAAAAGTGCC TCATAGTTAG GAAAGTTTAG GCACTAGTTG GGAGTAATAT TCAATTAATT 6780 

AGACTTTTGT ATAAGAGATG TCATGCCTCA AGAAAGCCAT AAACCTGGTA GGAACAGGTC 6840 

CCAAGCGGTT GAGCCTGGCA GAGTACCATG CGCTCGGCCC CAGCTGCAGG AAACAGCAGG 6900 

CCCCGCCCTC TCACAGAGGA TGGGTGAGGA GGCCAGACCT GCCCTGCCCC ATTGTCCAGA 6960 

TGGGCACTGC TGTGGAGTCT GCTTCTCCCA TGTACCAGGG CACCAGGCCC ACCCAACTGA 7020 

AGGCATGGCG GCGGGGTGCA GGGGAAAGTT AAAGGTGATG AOGATCATCA CACCTCGTGT 7080 

CGTTACCTCA GCCATCGGTC TftGCATATCA GTCACTGGGC CCAACATATC CATTTTTAAA 7140 
CCCTTTCCCC CAAATACACT GCGTCCTGGT TCCTGTTTAG CTGTTCTGAA ATA 

SEQ ID NO:288 PFD2 Protein sequence
Protein Accession #: A38198



1 11 21 31 41 51 

I I I I I I

MMMMMKHKKH QHQRQQQADH ANEANYARGT RLPLSGEGFT SQPNSSKQTV LSWQAAZDAA 60 

RQAKAAGTMS TSAPPPVGSL SQRKRQQYAK SKKQGNSSNS RPARALFCLS UJNPIRRACI 120 

SIVEWKPFDI FILLAIFANC VALAIYIPFP EDDSNSTNHN LEKVEVAFLI IFTVETFLKI 180 

IAYGLLLHPN AYVRNGWNLL DFVIVTVGLF SVTLEQLTKE TBGGNHSSGK SGGFDVKALR 240 

AFRVLRPUU. VSGVPSLQW LNSIIKAMVP LLHIALLVLF VIIIYAIIGL ELFIGKMHKT 300 

CFFADSDIVA EEDPAPCAFS GNGRQCTANG TBCRSGWVGP NGGITNFDNF APAHLTVFQC 360 

ITMEGWTDVL WVNDAIGWE WPWVYFVSLI ILGSFFVLNL VLGVLSGEPS KEREKAKARG 420 

DPQKLREKQQ LEEDLKGYLD WITQAEDIDP ENEEEGGEEG KRNTSMPTSE TESVNTENVS 480 

GEGENRGCCG SLWCWWRRRG AAKAGPSGCR RWGQAISKSK LSRRWRRWNR FNRRRCRAAV 540 

KSVTPYWLVI VLVFLNTLTI SSEHYNQPDW LTQIQDIANK VLLALFTCEM LVKMYSLGLQ 600 

AYFVSLFNRF DCPWCGGIT ETILVELEIM SPLGISVPRC VRLlRIPKVT RHWTSLSNLV 660 

ASLLNSMKSI ASLLLLLFLF IIIFSLLGMQ LFGGKPNPDE TQTKRSTFDN FPQALLTVFQ 720 

ILTGEDWNAV MYDGIMAYGG PSSSGMIVCI YFIILFICGN YILLNVFLAI AVDNLADAES 780 

UJTAQKEEAE EKERKKIARK ESLENKKNNK PEVNQIANSD NKVTIDDYRE EDEDKDFYPP 840

CDVPVGEEEE EEEEDEpEVP AGPRPRRISE LNMKEKIAPI PEGSAFFILS KTNPIRVGCH 900 

KLINHHTFTN LILVFIMLSS AALAAEDPIR SHSFRNTILG YFDYAFTAIF TVEILLKMTT 960 

FGAFLHKGAF CRNYFNLLDM LWGVSLVSF GIQSSAISW KILRVLRVLR PLRAINRAKG 1020 

LKHWQCVPV AIRTIGNim VTTLLQFHFA CIGVQLFKGK FYRCTDEAKS NPEECRGLPI 1080 

LYKDGDVDSP WRERIWQNS DFHFDNVLSA MHALFTVBTF EGWPALLYXA IDSKGENIGP 1140 

IYNHRVEISI PFIIYIIIVA FFMMNIFVGF VIVTFQEQGB KEYKNCELDK NQRQCVEYAL 1200 

KARPLRRYIP KNPYQYRFWY WNSSPFEYM HFVLIMLNTL CLAMQHYEQS KHFNDAMDIL 1260 

NMVFTGVFTV EMVLKVTAFK PKGYFSDAWN TFDSLIVTGS IIDVALSEAD PTESENVPVP 1320 

TATPGNSEES NRISITFFRL FRVMRLVKLL SRGEGIRTLL WTFIKSPQAL PYVALLIAML 1380 

FFIYAVZGMQ MFGKVAMRDN NQINRNNNPQ TFPQAVLLLF RCATGEAWQE IMLACLPGKL 1440 

CDPESDYNFG EEYTCGSNFA 1VYFISFYML CAFLIINLFV AVUdDNFDYL TRDWSILGPH 1500 

HLDEPKRIWS EYDPEAKGRI KHLDWTLLR RIQPPLGFGK LCPHRVACKR LVAMNMPLNS 1560 

DGTVMFNATL FALVRTALKI KTEGNLEQAN EEL RAVI KKI WKKTSMKLLD QWPPAGDDE 1620 

VTVGKPYATP LIQDYFRKFK KRKEQ6LVGK YPAKNTTIAL QAGLRTLHDI GPEIRRAISC 1680 

DLQDDEPEET KREEEDDVFK RNGALLGNHV NHVNSDRRDS LQQTNTTHRP LHVQRPSIPP 1740 

ASDTEKPLFP PAGNSVCHNH HNHNSIGKQV PTSTNANLNN ANMSKAAHGK RPSIGNLEHV 1800 

SENGHHSSHK HDREPQRRSS VKRTRYYETY IRSDSGDEQL PTICREDPEI HGYFRDPHCL 1860 

GEQEYFSSEE CYEDDSSPTW SRQNYGYYSR YPGRNIDSER PRGYHHPQGF LEDDDSFVCY 1920 

DSRRSPRRRL LPPTPASHRR SSFNFECLRR QSSQBEVPSS PIPPHRTALP LHLMQQQIMA 1980 

VAGLDSSKAQ KYSPSHSTRS WATPPATPPY RDWTPCYTPL IQVEQSEALD GVNGSLPSLH 2040 

RSSWYTDEPD ISYRTFTPAS LTVPS5FRNK NSDKQRSADS LVEAVLISEG LGRYARDPKF 2100 

VSATKHEIAD ACDLTIDEME SAASTLLNGN VRPRANGDVG PLSHRQDYEL QDFGPGYSDE 2160 
EPDPQRDEED LADEMICITT L 

SEQ ID NO:289 OBI6 DNA SEQUENCE

Nucleic Acid Accession #: NM_002812

Coding sequence: 150-3362 (underlined sequence corresponds to start and stop codon)



1 11 21 31 41 51 

I I I I I I

AACTCCOGCC TCGGGACGCC TCGGGGTCGG GCTCCGGCTG CGGCTGCTGC TGCGGCGCCC 60 

GCGCTCOGGT GCGTCCGCCT CCTGTGCCOG CCGCGGAGCA GTCTGC GG CC OGCCGTGCGC 120 

CCTCAGCTCC TTTTCCTGAG CCOGOCGC GA TG GGAGCTGC GCGGGGATCC CCGGCCAGAC 180 

COOSCCGGTT GCCTCTGCTC AGCGTCCTGC TGCTGCCGCT GCTGGGCGGT ACCCAGACAG 240 

CCATTGTCTT CATCAAGCAG CCGTCCTCCC AGGATGCACT GCAGGGGCGC CGGGCGCTGC 300 

TTCGCTGTGA GGTTGAGGCT CCGGGCCCGG TACATGTGTA CTGGCTGCTC GATGGGGCCC 360 

CTGTCCAGGA CACGGAGCGG CGTTTCGCCC AGGGCAGCAG CCTGAGCTTT GCAGCTGTGG 420 

ACCGGCTGCA GGACTCTGGC ACCTTCCAGT GTGTGGCTCG GGATGATGTC ACTGGAGAAG 480 

AAGCCCGCAG TGCCAACGCC TCCTTCAACA TCAAATGGAT TGAGGCAGGT CCTGTGGTCC 540 

TGAAGCATCC AGCCTCGGAA GCTGAQATCC AGCCACACSAC CCAGGTCACA CTTCGTTGCC 600 

ACATTGATGG GCACCCTCGG CCCACCTACC AATGGTTCCG AGATGGGACC CXCCTTTCTG 660 

ATGGTCAGAG CAACCACACA GTCAGCAGCA AGGAGCGGAA CCTGACGCTC CGGCCAGCTG 720 

GTCCTGAGCA TAGTGGGCTG TATTCCTGCT GCGCCCACAG TGCTTTTGGC CAGGCTTGCA 780 

GCAGCCAGAA CTTCACCTTG AGCATTGCTG ATGAAAGCTT TGCCAGGGTG GTGCTGGCAC 840 

CCCAGGACGT GGTAGTAGCG AGGTATGAGG AGGCCATGTT CCATTGCCAG TTCTCAGCCC 900 

AGCCACCCCC GAGCCTGCAG TGGCTCTTTG AGGATGAGAC TCCCATCACT AACCGCAGTC 960 

GCCCCCCACA CCTCCGCAGA GCCACAGTGT TTGCCAACGG GTCTCTGCTG CTGACCCAGG 1020 

TCCGGCCACG CAATGCAGGG ATCTACCGCT GCATTGGCCA GGGGCAGAGG GGCCCACCCA 1080 

TCATCCTGGA AGCCACACTT CACCTAGCAG AGATTGAAGA CATGCCGCTA TTTGAGCCAC 1140 

GGGTGTTTAC AGCTGGCA6C GAGGAGCGTG TGACCTGCCT TCCCCCCAAG GGTCTGCCAG 1200 

AGCCCAGCGT GTGGTGGGAG CACGCGGGAG TCCGGCTGCC CACCCATGGC AGGGTCTACC 1260 

AGAAGGGCCA CGAGCTGGTG TTGGCCAATA TTGCTGAAAG TGATGCTGGT GTCTACACCT 1320 

GCCACGCGGC CAACCTGGCT GGTCAGCGGA GACAGGATGT CAACATCACT GTGGCCACTG 1380 

TGCCCTCCTG GCTGAAGAAG CCCCAAGACA GCCAGCTGGA GGAGGGCAAA CCCGGCTACT 1440 

TGGATTGCCT GACCCAGGCC ACACCAAAAC CTACAGTTGT CTGGTACAGA AACCAGATGC 1500 

TCATCTCAGA GGACTCACGG TTCGAGGTCT TCAAGAATGG GACCTTGCGC ATCAACAGCG 1560 

TGGAGGTGTA TGATGGGACA TGGTACCGTT GIATGAGCAG CACCCCAGCC GGCAGCATCG 1620 

AGGCGCAAGC CCGTGTCCAA GTGCTGGAAA AGCTCAAGTT CACACCACCA CCCCAGCCAC 1680 

AGCAGTGCAT GGAGTTTGAC AAGGAGGCCA CGGTGCCCTG TTCAGCCACA GGCCGAGAGA 1740 

AGCCCACTAT TAAGTGGGAA CGGGCAGATG GGAGCAGCCT CCCAGAGTGG GTGACAGACA 1800 

ACGCTGGGAC CCTGCATTTT GCCCGGGTGA CTCGAGATGA CGCTGGCAAC TACACTTGCA 1860 

TTGCCTCCAA CGGGCCGCAG GGCCAGATTC GTGCCCATGT CCAGCTCACT GTGGCAGTTT 1920 

TTATCACCTT CAAAGTGGAA CCAGAGCGTA CGACTGTGTA CCAGGGCCAC ACAGCCCTAC 1980 

TGCAGTGCGA GGCCCAGGGG GACCCCAAGC CGCTGATTCA GTGGAAAGGC AAGGAOCGCA 2040 

TCCTGGACCC CACCAAGCTG GGACCCAGGA TGCACATCTT CCAGAATGGC TCCCTGGTGA 2100 

TCCATGACGT GGCCCCTGAG GACTCAGGCC GCTACACCTG CATTGCAGGC AACAGCTGCA 2160 

ACATCAAGCA CACGGAGGCC CCCCTCTATG TCGTGGACAA GCCTGTGCCG GAGGAGTCGG 2220 

AGGGCCCTGG CAGCCCTCCC CCCTACAAGA TGATCCAGAC CATTGGGTTG TCGGTGGGTG 2280 

CCGCTGTCGC CTACATCATT GCCGTCCTGG GCCTCATGTT CTACTGCAAG AAGCGCTGCA 2340 

AAGCCAAGCG GCTGCAGAAG CAGCCCGAGG GCGAGGAGCC AGAGATGGAA TGCCTCAACG 2400 

GAGGGCCTTT GCAGAACGGG CAGCCCTCAG CAGAGATCCA AGAAGAAGTG GCCTTGACCA 2460 

GCTTGGGCTC CGGCCCOGCG GCCACCAACA AACGCCACAG CACAAGTGAT AAGATGCACT 2520 

TCCCACGGTC TAGCCTGCAG CCCATCACCA CGCTGGGGAA GAGTGAGTTT GGGGAGGTGT 2580 

TCCTGGCAAA GGCTCAGGGC TTGGAGGAGG GAGTGGCAGA GACCCTGGTA CTTGTGAAGA 2640 

GCCTGCAGAC GAAGGATGAG CAGCAGCAGC TGGACTTCCG GAGGGAGTTG GAGATGTTTG 2700 

GGAAGCTGAA CGACGCCAAC GTGGTGCGGC TCCTGGGGCT GTGCCGGGAG GCTGAGCOCC 2760 

ACTACATGGT GCTGGAATAT GTGGATCTGG GAGACCTCAA GCAGTTCCTG AGGATTTCCA 2820 

AGAGCAAGGA TGAAAAATTG AAGTCACAGC CCCTCAGCAC CAAGCAGAAG GTGGCCCTAT 2880 

GCACCCAGGT AGCCCTGGGC ATGGAGCACC TGTCCAACAA CCGCTTTGTG CATAAGGACT 2940 

TGGCTGCGCG TAACTGCCTG GTCAGTGCCC AGAGACAAGT GAAGGTGTCT GCCCTGGGCC 3000 

TCAGCAAGGA TGTGTACAAC AGTGAGTACT ACCACTTCCG CCAGGCCTGG GTGCCGCTGC 3060 

GCTGGATGTC CCCCGAGGCC ATCCTGGAGG GTGACTTCTC TACCAAGTCT GATGTCTGGG 3120 

CCTTCGGTGT GCTGATGTGG GAAGTGTTTA CACATGGAGA GATGCCCCAT GGTGGGCAGG 3180 

CAGATGATGA AGTACTGGCA GATTTGCAGG CTGGGAAGGC TAGACTTCCT CAGCCCGAGG 3240 

GCTGCCCTTC CAAACTCTAT CGGCTGATGC AGCGCTGCTG GGCCCTCAGC CCCAAGGACC 3300 

GGCCCTCCTT CAGTGAGATT GCCAGCGCCC TGGGAGACAG CACCGTGGAC AGCAAGCCGT 3360 

GAGGAGGGAG CCCGCTCAGG ATGGCCTGGG CAGGGGAGGA CATCTCTAGA GGGAAGCTCA 3420 
CAGCATGATG GGCAAGATCC CTGTCCTCCT GGGCCCTGAG GTGCCCTAGT GCAACAGGCA 3480

TTGCTGAGGT CTGAGCAGGG CCTGGCCTTT CCTCCTCTTC CTCACCCTCA TCCTTTGGGA 3540 

GGCTGACTTG GACCCAAACT GGGCGACTAG GGCTTTGAGC TGGGCAGTTT CCCCTGCCAC 3600 

CTCT TOCTCT ATCAGGGACA GTGTGGGTGC CACAGGTAAC CCCAATTTCT GGCCTTCAAC 3660 

TTCTCCCCTT GACCGGGTCC AACTCTGCCA CTCATCTGCC AACTTTGCCT GGGGAGGGCT 3720 

AGGCTTGGGA TGAGCTGGGT TTGTGGGGAG TTCCTTAATA TTCTCAAGTT CTGGGCACAC 3780 

AGGGTTAATG AGTCTCTTGC CCACTGGTCC ACTTGGGGGT CTAGACCAGG ATTATAGAGG 3840 

ACACAGCAAG TGAGTCCTCC CCACTCTGGG CTTGTGCACA CTGACCCAGA CCCACGTCTT 3900 

CCCCAOCCTT CTCTCCTTTC CTCATCCTAA GTGCCTGGCA GATGAAGGAG TTTTCAGGAG 3960 

CTTTTGACAC TATATAAACC GCCCTTTTTG TATGCACCAC GGGCGGCTTT TATATGTAAT 4020 

TGCAGOGTGG GGTGGGTGGG CATGGGAGGT AGGGGTGGGC CCTGGAGATG AGGAGGGTGG 4080 

GCCATOCTTA CCCCACACTT TTATTGTTGT CGTTTTTTGT TTGTTTTSTT TTTTTGTTTT 4140 
TGTTTTTGTT TTTACACTCG CTGCTCTCAA TAAATAAGCC TTTTTTA 



SEQ ID NO:290 OBI6 Protein sequence
Protein Accession #: NP_002812

1 11 21 31 41 51 

I I I I I I 

HGAARGSPAR PRRLPLLSVL LLPLLGGTQT AIVFIKQPSS QDALQGRRAL LRCEVEAPGP 60 

VHVYWLLDGA PVQOTERRFA QGSSLSFAAV DRLQDSGTFQ CVARDOVTGE EARSANASFN 120 

IKWIEAGPW LKHPASEAEI QPQTQVTLRC HIDGHPRPTY QWFRDGTPLS DGQSNHTVSS 180 

KERNLTLRPA GPEHSGLYSC CAHSAFGQAC SSQNFTLSIA DESFABWLA PQDVWAKYE 240 

EAMFHCQFSA QPPPSLQWLP BDETPITNRS RPPHLRRATV FANGSLLLTQ VRPRNAGIYR 300 

CIGQGQRGPP IILEATLHLA EIEDMPLFBP RVFTAGSEER VTCLPPKGLP EPSVWWEHAG 360 

VRLPTHGKVY QKGHELVLAK IAESDAGVYT CHAANLAGQR RQBVNITVAT VPSWLKKPQD 420 

SQLEEGKPGY LDCLTQATPK PTWWYRNQM LISEDSRFEV FKNGTLRZNS VEVYDGTWYR 480 

CMSSTPAGSI EAQARVQVLE KLKPTPPPQP QQCMEFDKEA TVPCSATGRE KPTIKWERAD 540 

GSSLPEWVTD NAGTLHFARV TRDDAGNYTC XASN6PQGQI RAHVQLTVAV FITFKVEPKR 600 

TTVYQGHTAL LQCEAQGDPK PLZQWKGKDR ILDPTKLGPR MHIFQNGSLV IHDVAPEDSG 660 

RYTCIAGNSC NIKHTEAPLY WDKPVPEES EGPGSPPPYK MIQTIGLSVG AAVAYIZAVL 720 

GLMFYCKKRC KAKRLOKQPE GESPEMECLN GGPLQNGQPS AEIQBEVALT SLGSGPAATN 780 

KRHSTSDKMH FPRSSLQPIT TLGKSEFGEV PLAKAQGLEE GVAETLVLVK SLQTKDEQQQ 840

LDFRRELEMP GKLNHANWR LLGLCREAEP HYKVLZYVDL GDLKQFLRIS KSKDEKLKSQ 900 

PLSTKQKVAL CTQVALGMEH LSNNRFVHKD LAARNCLVSA QRQVKVSALG LSKDVYNSEY 960 

YHFRQAWVPL RWMSPEAILE GDPSTKSDVW AFGVLMWEVP THGEMPHGGQ ADDEVLADLQ 1020 
AGKARLPQPE GCPSKLYRLM QRCWALSPKD RPSPSEIASA LGDSTVDSKP 



SEQ ID NO:291 AAB1 DNA SEQUENCE

Nucleic Acid Accession #: NM_002205

Coding sequence: 1-3150 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

I I I I I I

ATGGGGAGCC GGACGCCAGA GTCCCCTCTC CACGCCGTGC AGCTGCGCTG GGGCCCCCGG 60 

CGCCGACCCC CGCTSSTGCC GCTGCTQTTQ CTGCTSSTGC CGCOGCCACC CAGGGTCGGG 120 

GGCTTCAACT TAGACGCGGA GGCCCCAGCA GTACTCTCGG GGCCCCCGGG CTCCTTCTTC 180 

GGATTCTCAG TGGAGTTTTA CCGGCCGGGA ACAGACGGGG TCAGTGTGCT GGTGGGAGCA 240 

CCCAAGGCTA ATACCAGCCA GCCAGGAGTG CTGCA QGGTG GTGCTGTCTA OCTCTGTCCT 300 

TGGGGTGCCA GCCCCACACA GTGCAOCCCC ATTGAATTTG ACAGCAAAGG CTCTCGGCTC 360 

CTGGAGTCCT CACTGTCCAG CTCAGAGGGA GAGGAGCCTG TGGAGTACAA GTCCTTGCAG 420 

TGGTTCGGGG CAACAGTTCG AGCCCATGGC TCCTCCATCT TGGCATGCGC TCCACTGTAC 480 

AGCTGGCGCA CAGAGAAGGA GCCACTGAGC GACCCCGTGG GCACCTGCTA CCTCTCCACA 540 

GAXAACTTCA CCCGAATTCT GGAGTATGCA CCCTGCCGCT CAGATTTCAG CTGGGCAGCA 600 

GGACAGGGTT ACTGCCAAGG AGGCTTCAGT GCOGAGTTCA CCAAGACTGG CCGTGTGGTT 660 

TTAGGTGGAC CAGGAAGCTA TTTCTGGCAA GGCCAGATCC TGTCTGCCAC TCAGGAGCAG 720 

ATTGCAGAAT CTTATTACCC CGAGTACCTG ATCAACCTGG TTCAGGGGCA GCTGCAGACT 780 

CGCCAGGCCA GTTCCATCTA TGATGACAGC TACCTAGGAT ACTCTGTGGC TGTTGGTGAA 840 

TTCAGTGGTG ATGACACAGA AGACTTTGTT GCTGGTGTGC CCAAAGGGAA CCTCACTTAC 900 

GGCTATGTCA CCATCCTTAA TGGCTCAGAC ATTCGATCCC TCTACAACTT CTCAGGGGAA 960 

CAGATGGCCT CCTACTTTGG CTATGCAGTG GCCGCCACAG ACGTCAATGG GGACGGGCTG 1020 

GATGACTTGC TGGTGGGGGC ACCCCTGCTC ATGGATCGGA CCCCTGAGGG GCGGCCTCAG 1080 

GAGGTGGGCA GGGTCTACGT CTACCTGCAG CACCCAGCCG GCATAGAGCC CACGCCCACC 1140 

CTTACCCTCA CTGGCCATGA TGAGTTTGGC CGATTTGGCA GCTCCTTGAC CCCCCTGGGG 1200 

GACCTGGACC AGGATGGCTA CAATGATGTG GCCATCGGGG CTCCCTTTGG TGGGGAGACC 1260 

CAGCAGGGAG TAGTGTTTGT ATTTCCTGGG GGCOCAGGAG GGCTGGGCTC TAAGCCTTCC 1320 

CAGGTTCTGC AGCCCCTGTG GGCAGCCAGC CACACCCCAG ACTTCTTTGG CTCTGCCCTT 1380 

CGAGGAGGOC GAGACCTGGA TGGCAATGGA TATCCTGATC TGATTGTGGG GTCCTTTGGT 1440 

GTGGACAAGG CTGTGGTATA CAGGGGCCGC CCCATCGTGT CCGCTAGTGC CTCCCTCACC 1500 

ATCTTCCCCG CCATGTTCAA CCCAGAGGAG CGGAGCTGCA GCTTAGAGGG GAACCCTGTG 1560 

GCCTGCATCA ACCTTAGCTT CTGCCTCAAT GCTTCTGGAA AACACGTTGC TGACTCCATT 1620 

GGTTTCACAG TGGAACTTCA GCTGGACTGG CAGAAGCAGA AGGGAGGGGT ACGGCGGGCA 1680 

CTGTTCCTGG CCTCCAGGCA GGCAACCCTG ACCCAGACCC TGCTCATCCA GAATGGGGCT 1740 

CGAGAGGATT GCAGAGAGAT GAAGATCTAC CTCAGGAACG AGTCAGAATT TCGAGACAAA 1800 

CTCTCGCCGA TTCACATCGC TCTCAACTTC TCCTTGGACC CCCAAGCCCC AGTGGACAGC 1860 

CACGGCCTCA GGCCAGCCCT ACATTATCAG AGCAAGAGCC GGATAGAGGA CAAGGCTCAG 1920 

ATCTTGCTGG ACTGTGGAGA AGACAACATC TGTGTGCCTG ACCTGCAGCT GGAAGTGTTT 1980 
GGGGAGCAGA ACCATGTGTA CCTGGGTGAC AAGAATGCCC TGAACCTCAC TTTCCATGCC 2040

CAGAATGTGG 6TGAG6GTGG CGCCTATGAG GCTGAGCTTC GGGTCACCGC COCTCCAGAG 2100 

GCTGAGTACT CAGGACTCGT CAGACACCCA GGGAACTTCT CCAGCCTGAG CTGTGACTAC 2160 

TTTGCCGTGA ACCAGAGCCG CCTGCTGGTG TGTGACCTGG GCAACCCCAT GAAGGCAGGA 2220 

GCCAGTCTGT GGGGTGGCCT TCGGTTTACA GTCCCTCATC TCCGGGACAC TAAGAAAACC 2280 

ATCCAGTTTG ACTTCCAGAT CCTCAGCAAG AATCTCAACA ACTCGCAAAG CGACGTGGTT 2340 

TCCTTTCGGC TCTCCGTGGA GGCTCAGGCC CAGGTCACCC TGAACGGTGT CTCCAAGCCT 2400 

GAGGCAGTGC TATTOCCAGT AAGCGACTGG CATCCCCGAG ACCAGCCTCA QAAGGAGGAG 2460 

GACCTGGGAC CTGCTQTCCA CCATGTCTAT GAGCTCATCA ACCAAGGCCC CAGCTCCATT 2520 

AGCCAGGGTG TGCTGGAACT CAGCTGTCCC CAGGCTCTGG AAGGTCAGCA GCTCCTATAT 2580

GTGACCAGAG TTACGGGACT CAACTGCACC AGCAATCACC CCATTAACCC AAAGGGCCTG 2640 

GAGTTGGATC CCGAGGGTTC CCTGCACCAC CAGCAAAAAC GGGAAGCTCC AAGCCGCAGC 2700 

TCTGCTTCCT CGGGACCTCA GATCCTGAAA TGCCCGGAGG CTGAGTGTTT CAGGCTGCGC 2760 

TGTGAGCTCG GGCCCCTGCA CCAACAAGAG AGCCAAAGTC TGCAGTTGCA TTTCCGAGTC 2820 

TGGGCCAAGA CTTTCTTGCA GCGGGAGCAC CAGCCATTTA GCCTCCAGTG TGAGGCTGTG 2880 

TACAAAGCCC TGAAGATGCC CTACCGAATC CTGCCTCGGC AGCTGCCCCA AAAAGAGCGT 2940 

CAGGTGGCCA CAGCTGTGCA ATGGACCAAG GCAGAAGGCA GCTATGGCGT CCCACTGTGG 3000 

ATCATCATCC TAGCCATCCT GTTTGGCCTC CTGCTCCTAG GTCTACTCAT CTACATCCTC 3060 

TACAAGCTTG GATTCTTCAA ACGCTCCCTC CCATATGGCA CCGCCATGGA AAAAGCTCAG 3120 
CTCAAGCCTC CAGCCACCTC TGATGC CTGA 



SEQ ID NO:292 AAB1 Protein sequence
Protein Accession #: NP_002106

1 11 21 31 41 51 

I I I I I I 

HGSRTPESFL HAVQLRWGPR RRPPLLPLLL LLLPPPPRVG GFNLDAEAPA VLSGPPGSFF 60 

GFSVEFYRPG TOGVSVLVGA PKANTSQPGV LQGGAVYLCP WGASPTQCTP IEFDSKGSRL 120 

LBSSLSSSEG EBPVEVRSLQ WFGATVRAHG SSILACAPLY SWRTEKEPLS DFVGTCYLST 180 

DNFTRILEYA PCRSDFSWAA GQGYCQGGFS AEFTKTGKW LGGPGSYFWQ GQILSATQBQ 240 

IAESYYPHYL INLVQGQLQT RQASSIYDDS YLGYSVAVGE FSGDDTEDFV AGVPKGNLTY 300 

GYVTCLHGSD IRSLYNFSGE QMASYFGYAV AATDVNGDGL DDLLVGAPLL MDRTPDGRPQ 360 

EVGRVYVYLQ HPAGIEPTPT LTLTGHDEFG RFGSSLTPLG DLDQDGYNDV AIGAPFGGET 420 

QQGWFVFPG GPGGLGSKPS QVLQPLWAAS HTPDFFGSAL RGGRDLDGNG YPDLIVGSFG 480 

VDKAWYRGR PIVSASASLT IFPAMFNPEE RSCSLEGNPV ACDJLSFCLK ASGKHVADSI 540 

GFTVELQLDW QKQKGGVRRA LFLASRQATL TQTLLIQNGA REDCREMKIY LRNESEFRDK 600 

LSPIHIA&NF SLDPQAPVDS HGLRPALHYQ SKSRIEDKAQ ILLDCGEDNI CVPDLQLEVF 660 

GBQNHVYXiGD KNALNLTFHA QNVGEGGAYE AELRVTAPPE AEYSGLVRHP GNFSSLSCDY 720 

FAVNQSRLLV CDLGNFMKAG ASLWGGLRFT VPHLRDTKKT IQFDFQILSK NLNNSQSDW 780 

SPRLSVEAQA QVTLNGVSKP EAVLFPVSDW HPRDQPQKEE DLGPAVHHVY ELINQGPSSI 840 

SQGVLELSCP QALBGQQLLY VTRVTGLNCT TNHPINPKGL ELDPEGSLHH QQKREAPSRS 900 

SASSGPQILK CPBAECFRXiR CELGPLHQQS SQSLQLHFRV WAKTFLQREH QPPSLQCKAV 960 

YKALKMPYRI LPRQLPQKER QVATAVQWTK AEGSYGVPLW IIILAILFGL LLLGLLIYIL 1020 
YKLGFPKRSL PYGTAMEKAQ LKPPATSDA 



SEQ ID NO:293 LBH4 DNA SEQUENCE

Nucleic Acid Accession #: BC001291

Coding sequence: 44-541 (start and stop codons are underlined)



1 11 21 31 41 51 
I I I I I I 

GGGGGGGGOG CGCGCTGACC CTOCCTGGGC ACCGCTGGGG ACGAJjQGOGC TGCTCGCCTT 60 
GCTGCTGOTC GTGGCCCTAC CGCGGGTGTO GACAGACGCC AACCTGACTO COAGACAACG 120 
AGATCCAGAG GACTCCCAGC GAACGGACGA GGGTGACAAT AGAGTCTGGT GTCATGTTTG 180 
TGAGAG AGAA AACACTTTCG AGTGCCAG AA CCCAAGGAGG TGCAAATGGA CAGAGCCATA 240 
CTGCGTTATA GOGGCCGTGA AAATATTTCC ACGTTTTTTC ATGGTTGCGA AGCAGTGCTC 300 
CCOGGTTGT GCAGCGATGG AGAGAGGCAA GCCAGAGGAG AAGCGGTTTC TGCTGGAAGA 360 
GCCCATGCCC TTCTTTTACC TCAAGTGTTG TAAAATTCGC TACTGCAATT TAGAGGGGCC 420 
AOCTATCAAC TCATCAGTGTTCAAAGAATA TGCTGGGAGC ATGGGTGAGA GCTGTGGTGG 480 
GCTGTGGCTG GCCATCCTCC TGCTGCTGGC CTCCATTGCA GCCGGCCTCA GCCTGTCTTG. 540 
AGCCACGGGA CTGCCACAGA CTGAGCCTTC CGGAGCATGG ACTC GCTCCA GACCGTTGTC 600 
ACCTGTTGCA TTAAACTTGT TrTCTGTTGA TTACCTCTTG GTTTGACrrC CCAGGGTCTT 660 
GGGATGGGAG AGTGGGGATC AGGTGCAGTT GGCTCTTAAC CCTCAAGGGT TCTTTAACTC 720 
ACATTCAGAG GAAGTCCAGA TCTCCTGAGT AGTGATTTTG GTGACAAGTT 111L1CU1U 780 
AAATCAAACC TTGTAACTCA TTTATTGCTG ATGGCCACTC TTTTCCTTGA CTCCCCTCTG 840 
CCTCTGAGGG CTTCAGTATT GATGGGGAGG GAGGCCTAAG TACCACTCAT GGAGAGTATG 900 
TGCTG AGATG CTTOCGACCT TTCAGGTGAC GCAGGAACAC TGGGGGAGTC TGAATGATTG 960 
GGGTGAAG AC ATOCCTGG AG TGAAGGACTC CTCAGCATGG GGGGCAGTGG GGCACACGTT 1020 
AGGGCTGCCC CCATTCCAGT GGTGGAGGCG CTGTGGATOG C 1GC T 1 1 1 CC TCAACCTTTC 1080
CTAOCAGATT CCAGGAGGCA GAAGATAACT AATTOTGTTG AAGAAACTTA GACTTCACCC 1140
ACCAGCTGGC ACAGGTGCAC AG ATTCATAA ATTCCCACAC GTGTGTGTTC AACATCTGAA 1200 
ACTTAGGCCA AGTAGAGAGC ATCAGGGTAA ATGGCGTTCA 1 1 IClCHiTT AAGATGCAGC 1260 
CATCCATGGG GAGCTGAGAA ATCAGACTCA AAGTTCCACC AAAAACAAAT ACAAGGGGAC 1320 
TTCAAAAGTT CACG AAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAA 
SEQ ID NO:294 LBH4 Protein sequence

Protein Accession #: AAH01291



1 11 21 31 41 51
I I I I I I

MALLALLLVV ALPRVWTDAN LTARQRDPED SQRTDEGDNR VWCHVCEREN TFBCQNPRRC 60 
KWTEPYCVIA AVKIFPRFFM VAKQCSAGCA AMERPKPEEK RFLLEEPMPF FYLKCCKIR Y 120 
CNLEGPPINS SVFKEYAGSM GESCGGLWLA ILLUASIAA GLSLS 
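
The coding-sequence coordinates given with each DNA listing above can be checked against the length of the paired protein listing. The following is a minimal sketch, assuming the coordinates are 1-based and inclusive and that the range includes the stop codon; the function name is hypothetical, and the residue count used for AAB1 (1049) is an assumption read off the position ruler of the protein listing rather than a value stated in the text.

```python
def cds_length_consistent(start: int, end: int, protein_length: int) -> bool:
    """Check a 1-based, inclusive coding range against a protein length.

    The range is assumed to cover every codon of the protein plus one
    stop codon, so it must span 3 * (protein_length + 1) bases.
    """
    n_bases = end - start + 1
    return n_bases % 3 == 0 and n_bases // 3 == protein_length + 1


# AAB1: coding sequence 1-3150; 1049 residues is an assumption taken from
# the ruler of the protein listing, not a figure quoted in the document.
print(cds_length_consistent(1, 3150, 1049))

# A residue count that is off by one fails the check.
print(cds_length_consistent(1, 3150, 1048))
```

A failed check does not say which value is wrong; it only flags a listing whose coordinates or residue count should be compared against the accession record.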

It is understood that the examples described above in no way serve to limit the 
true scope of this invention, but rather are presented for illustrative purposes. All 
publications, sequences of accession numbers, and patent applications cited in this 
specification are herein incorporated by reference as if each individual publication or patent 
application were specifically and individually indicated to be incorporated by reference.
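
The claims that follow recite sequences that are "at least 80% identical" or "at least 95% identical" to a listed sequence. As an illustration only, the sketch below computes percent identity over two sequences that have already been aligned to equal length. The function name, the gap character, and the choice to count gap columns as mismatches are all assumptions made here; the definition of identity given in the specification governs, and other conventions use a different denominator.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over a pairwise alignment of equal-length strings.

    Columns where both sequences carry a gap are skipped; a gap opposite
    a residue counts as a compared, non-matching position (an assumption).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap and y == gap:
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned positions to compare")
    return 100.0 * matches / compared


# Two hypothetical 10-base sequences that differ at two positions.
identity = percent_identity("ATGGGGAGCC", "ATGGCGAGCA")
print(identity, identity >= 80.0)
```

In practice the alignment itself would come from an alignment program; this sketch covers only the final counting step.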



WO 02/30268    PCT/US01/32045



WHAT IS CLAIMED IS: 

1. A method of detecting a prostate cancer-associated transcript in a cell
from a patient, the method comprising contacting a biological sample from the patient with a
polynucleotide that selectively hybridizes to a sequence at least 80% identical to a sequence
as shown in Tables 1-16.

2. The method of claim 1, wherein the polynucleotide selectively
hybridizes to a sequence at least 95% identical to a sequence as shown in Tables 1-16.

3. The method of claim 1, wherein the biological sample is a tissue
sample.

4. The method of claim 1, wherein the biological sample comprises
isolated nucleic acids.

5. The method of claim 4, wherein the nucleic acids are mRNA.

6. The method of claim 4, further comprising the step of amplifying
nucleic acids before the step of contacting the biological sample with the polynucleotide.

7. The method of claim 1, wherein the polynucleotide comprises a
sequence as shown in Tables 1-16.

8. The method of claim 1, wherein the polynucleotide is labeled.

9. The method of claim 8, wherein the label is a fluorescent label.

10. The method of claim 1, wherein the polynucleotide is immobilized on
a solid surface.

11. The method of claim 1, wherein the patient is undergoing a therapeutic
regimen to treat prostate cancer.

12. The method of claim 1, wherein the patient is suspected of having
prostate cancer.
13. A method of monitoring the efficacy of a therapeutic treatment of
prostate cancer, the method comprising the steps of:
(i) providing a biological sample from a patient undergoing the therapeutic
treatment; and
(ii) determining the level of a prostate cancer-associated transcript in the
biological sample by contacting the biological sample with a polynucleotide that selectively
hybridizes to a sequence at least 80% identical to a sequence as shown in Tables 1-16,
thereby monitoring the efficacy of the therapy.

14. The method of claim 13, further comprising the step of: (iii) comparing
the level of the prostate cancer-associated transcript to a level of the prostate
cancer-associated transcript in a biological sample from the patient prior to, or earlier in, the
therapeutic treatment.

15. The method of claim 13, wherein the patient is a human.

16. A method of monitoring the efficacy of a therapeutic treatment of
prostate cancer, the method comprising the steps of:
(i) providing a biological sample from a patient undergoing the therapeutic
treatment; and
(ii) determining the level of a prostate cancer-associated antibody in the
biological sample by contacting the biological sample with a polypeptide encoded by a
polynucleotide that selectively hybridizes to a sequence at least 80% identical to a sequence
as shown in Tables 1-16, wherein the polypeptide specifically binds to the prostate
cancer-associated antibody, thereby monitoring the efficacy of the therapy.

17. The method of claim 16, further comprising the step of: (iii) comparing
the level of the prostate cancer-associated antibody to a level of the prostate
cancer-associated antibody in a biological sample from the patient prior to, or earlier in, the
therapeutic treatment.

18. The method of claim 16, wherein the patient is a human.
19. A method of monitoring the efficacy of a therapeutic treatment of
prostate cancer, the method comprising the steps of:
(i) providing a biological sample from a patient undergoing the therapeutic
treatment; and
(ii) determining the level of a prostate cancer-associated polypeptide in the
biological sample by contacting the biological sample with an antibody, wherein the antibody
specifically binds to a polypeptide encoded by a polynucleotide that selectively hybridizes to
a sequence at least 80% identical to a sequence as shown in Tables 1-16, thereby monitoring
the efficacy of the therapy.

20. The method of claim 19, further comprising the step of: (iii) comparing
the level of the prostate cancer-associated polypeptide to a level of the prostate
cancer-associated polypeptide in a biological sample from the patient prior to, or earlier in, the
therapeutic treatment.

21. The method of claim 19, wherein the patient is a human.

22. An isolated nucleic acid molecule consisting of a polynucleotide
sequence as shown in Tables 1-16.

23. The nucleic acid molecule of claim 22, which is labeled.

24. The nucleic acid of claim 23, wherein the label is a fluorescent label.

25. An expression vector comprising the nucleic acid of claim 22.

26. A host cell comprising the expression vector of claim 25.

27. An isolated polypeptide which is encoded by a nucleic acid molecule
having a polynucleotide sequence as shown in Tables 1-16.

28. An antibody that specifically binds a polypeptide of claim 27.

29. The antibody of claim 28, further conjugated to an effector component.
30. The antibody of claim 29, wherein the effector component is a
fluorescent label.

31. The antibody of claim 29, wherein the effector component is a
radioisotope or a cytotoxic chemical.

32. The antibody of claim 29, which is an antibody fragment.

33. The antibody of claim 29, which is a humanized antibody.

34. A method of detecting a prostate cancer cell in a biological sample
from a patient, the method comprising contacting the biological sample with an antibody of
claim 28.

35. The method of claim 34, wherein the antibody is further conjugated to
an effector component.

36. The method of claim 35, wherein the effector component is a
fluorescent label.

37. A method of detecting antibodies specific to prostate cancer in a
patient, the method comprising contacting a biological sample from the patient with a
polypeptide encoded by a nucleic acid that comprises a sequence from Tables 1-16.

38. A method for identifying a compound that modulates a prostate
cancer-associated polypeptide, the method comprising the steps of:
(i) contacting the compound with a prostate cancer-associated polypeptide, the
polypeptide encoded by a polynucleotide that selectively hybridizes to a sequence at least
80% identical to a sequence as shown in Tables 1-16; and
(ii) determining the functional effect of the compound upon the polypeptide.

39. The method of claim 38, wherein the functional effect is a physical
effect.
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1 40. The method of claim 38, wherein the functional effect is a chemical 

2 effect. 

1 41. The method of claim 38, wherein the polypeptide is expressed in a 

2 eukaryotic host cell or cell membrane. 

1 42. The method of claim 38, wherein the functional effect is determined by 

2 measuring ligand binding to the polypeptide. 

1 43. The method of claim 38, wherein the polypeptide is recombinant. 

1 44. A method of inhibiting proliferation of a prostate cancer-associated 

2 cell to treat prostate cancer in a patient, the method comprising the step of administering to 

3 the subject a therapeutically effective amount of a compound identified using the method of 

4 claim 38. 

1 45. The method of claim 44, wherein the compound is an antibody. 

1 46. The method of claim 45, wherein the patient is a human. 

1 47. A drug screening assay comprising the steps of:

2 (i) administering a test compound to a mammal having prostate cancer or a 

3 cell isolated therefrom; 

4 (ii) comparing the level of gene expression of a polynucleotide that selectively 

5 hybridizes to a sequence at least 80% identical to a sequence as shown in Tables 1-16 in a 

6 treated cell or mammal with the level of gene expression of the polynucleotide in a control 

7 cell or mammal, wherein a test compound that modulates the level of expression of the 

8 polynucleotide is a candidate for the treatment of prostate cancer. 

1 48. The assay of claim 47, wherein the control is a mammal with prostate 

2 cancer or a cell therefrom that has not been treated with the test compound. 

1 49. The assay of claim 47, wherein the control is a normal cell or mammal.

1 50. A method for treating a mammal having prostate cancer comprising

2 administering a compound identified by the assay of claim 47. 

1 51. A pharmaceutical composition for treating a mammal having prostate

2 cancer, the composition comprising a compound identified by the assay of claim 47 and a 

3 physiologically acceptable excipient. 

1 52. The method according to claim 1, wherein said biological sample is 



2 contacted with a plurality of polynucleotides comprising a first polynucleotide that 

3 selectively hybridizes to a sequence at least 80% identical to a first sequence as shown in 

4 Tables 1-16; and a second polynucleotide that selectively hybridizes to a second sequence at 

5 least 80% identical to a second sequence as shown in Tables 1-16. 



1 53. A method according to claim 52, wherein the plurality of 

2 polynucleotides comprises a third polynucleotide that selectively hybridizes to a sequence at 

3 least 80% identical to a third sequence as shown in Tables 1-16.

1 54. A method of detecting a prostate cancer associated transcript, the 

2 method comprising contacting a biological sample from the patient with a plurality of 

3 polynucleotides wherein at least two of said polynucleotides selectively hybridize to a 

4 different sequence at least 80% identical to a sequence as shown in Tables 1-16.

1 55. A method of detecting a prostate cancer, the method comprising the 

2 steps of: 

3 (i) providing a biological sample from a patient; 

4 (ii) contacting the biological sample with a first polynucleotide that selectively 



5 hybridizes to a sequence at least 80% identical to a first sequence as shown in Tables 1-16 to 

6 determine the level of a prostate cancer-associated transcript in the biological sample; and 

7 with a second polynucleotide that selectively hybridizes to a second sequence at least 80% 

8 identical to a sequence not shown in Tables 1-16; wherein the expression of said second 

9 sequence is not substantially changed in prostate cancer, to determine the level of expression

10 of a control transcript in the biological sample;

11 (iii) comparing the level of the prostate cancer-associated transcript to a level

12 of the normal tissue associated transcript in the biological sample. 

1 56. A method of quantitating a prostate cancer-associated transcript in a 

2 cell from a patient, the method comprising contacting a biological sample from the patient 

3 with a polynucleotide that selectively hybridizes to a sequence at least 80% identical to a 

4 sequence as shown in Tables 1-16. 

1 57. The method of claim 56, wherein the polynucleotide selectively 

2 hybridizes to a sequence at least 95% identical to a sequence as shown in Tables 1-16. 

1 58. The method of claim 56, wherein the biological sample is a tissue 

2 sample. 

1 59. The method of claim 56, wherein the biological sample comprises 

2 isolated nucleic acids. 

1 60. The method of claim 56, wherein the nucleic acids are mRNA. 

1 61. The method of claim 59, further comprising the step of amplifying

2 nucleic acids before the step of contacting the biological sample with the polynucleotide. 

1 62. The method of claim 56, wherein the polynucleotide comprises a 

2 sequence as shown in Tables 1-16. 

1 63. The method of claim 56, wherein the polynucleotide is labeled. 

1 64. The method of claim 63, wherein the label is a fluorescent label. 

1 65. The method of claim 56, wherein the polynucleotide is immobilized on

2 a solid surface. 

1 66. The method of claim 56, wherein the patient is undergoing a

2 therapeutic regimen to treat metastatic prostate cancer. 

1 67. The method of claim 56, wherein the patient is suspected of having 

2 metastatic prostate cancer. 


1 68. A biochip comprising a plurality of polynucleotides that selectively 

2 hybridize to a sequence at least 80% identical to a sequence as shown in Tables 1-16. 

1 69. A method of screening drug candidates comprising: 

2 i) providing a cell that expresses an expression profile gene selected from the 

3 group consisting of an expression profile gene set forth in Tables 1-16 or fragment thereof; 

4 ii) adding a drug candidate to said cell; and 

5 iii) determining the effect of said drug candidate on the expression of said 

6 expression profile gene. 

1 70. A method according to claim 59 wherein said determining comprises 

2 comparing the level of expression in the absence of said drug candidate to the level of

3 expression in the presence of said drug candidate. 
